Simulation method and system for simulating a multi-core hardware platform

ABSTRACT

Embodiments of the invention relate to methods and systems for simulating a multi-core hardware platform the devices of which are modeled by functional or cycle-based models. In order to improve the simulation speed, a computer implemented method utilizes functional models that include an execution time in the reply to a transaction while maintaining the simulation accuracy relative to a cycle-based simulation of the same hardware platform. The execution time indicates an estimated number of cycles of a main clock which the represented device would have required for executing the operation. The simulation system initiates a transaction by a master model to request the execution of an operation by a slave model. The slave model executes the requested operation, and replies to the transaction returning a result of the executed operation to the master model, and where the slave model is a functional model, the execution time.

PRIORITY CLAIM

The present application claims the benefit of Italian Patent Application No. VI2010A000208, filed Jul. 28, 2010, which application is incorporated herein by reference in its entirety.

TECHNICAL FIELD

Embodiments of the present invention relate to a method for simulating a multi-core hardware platform. Embodiments of the present invention provide a method for simulating a multi-core hardware platform including a plurality of devices, wherein each device can be modeled as a functional model or as a cycle-based model. Embodiments of the present invention define simulations including functional models or cycle-based models where the functional models are capable of including an execution time in the reply to a transaction.

BACKGROUND

Multi-core hardware platforms are used in a majority of electronic appliances, for instance, in a private or a professional environment with the need for a large amount of processing power. Appliances for multi-core hardware platforms may be dedicated devices which are operating, e.g., in a standalone multimedia device such as a DVD player, Blu-Ray player or hard-drive video recorder, a TV, a multi-channel HIFI system, a networking device, a mobile phone, a personal digital assistant (PDA) and an MP3 player, or general purpose devices such as computers and the like. Such appliances demand different functionality which may be realized by the hardware platform connecting plural devices or “IP building blocks” with a special functionality via a bus-like or a point-to-point like data connection. Accordingly, the flow of data and/or instructions between the different devices or IP building blocks is essential to the functioning of the whole appliance. The terms IP and IP building block are each used herein as a synonym for a device.

During the design phase of such an appliance, a simulation platform is used to validate and to verify the functionality and to evaluate the performance of the hardware platform. Further, the simulation can also be used during testing to compare the simulated result with the results produced by an implementation of the hardware platform. There are also other advantageous combinations of a simulation and of a hardware implementation, for instance when the functionality of one device is performed only by the simulation and the other devices, with which the first device communicates, are already hardware implemented prototypes.

A hardware platform combines a plurality of hardware devices or IP building blocks. Thus, in a chip, usually multiple devices or IP building blocks are combined. Nevertheless, devices or IP building blocks may also be realized as separate chips. For instance, in a multi-chip design the communication between chips representing a device or IP building block may be realized via wires on a printed circuit board.

In order to distinguish the roles of a device or IP building block, the transaction model can be used. A transaction refers to an operation to be performed by two devices or IPs. A first device initiates the transaction and, hence, is called a master IP. The second device only responds to the transaction and, hence, is called a slave IP. The response of the slave IP may require some computations to be performed by the slave IP. Master IPs can for example be: CPUs (“central processing unit”), DMAs (“Direct Memory Access”), hardware accelerators and the like. Slave IPs connected to such transaction initiators or master IPs are for example: communication buses, network interfaces, memories, caches and the like. Typically, a master IP is connected to plural slave IPs to form a subsystem. However, a slave IP may also be shared amongst plural master IPs.

Further, there exists the additional case of one device or IP building block having the role of a slave IP regarding one transaction and the role of a master IP for a different transaction. This exception arises when the device or IP building block depends on a different device or IP building block to provide additional information. For example, a cache may depend on a memory whenever the data is not available in the cache itself. However, as the memory access is managed by the cache transparently to the CPU, the cache acts in the role of a slave IP for the CPU and in the role of a master IP for the memory. Thus, the two terms may also be used for the same device or IP building block as in the present example.

For simulating the above described hardware platform, each device or IP building block is represented by a model. Thereby, the simulation is capable of describing the transaction flow between devices or IP building blocks. Accordingly, not only the output data of the simulation can be used to validate or to verify the hardware platform design but also the virtual transactions performed between the models represent the communication between devices or IP building blocks. Thus, a simulation with a plurality of models, each model representing a device or IP building block, is advantageous for an accurate reproduction of the hardware platform behavior.

In a simulation, two different types of models can be distinguished, namely a functional model and a cycle-based model.

The functional model only reproduces the function of a device or IP building block, omitting the implementation of relevant details (e.g., internal state information, a clock cycle representation, a predefined execution speed). The functional model is capable of replying to a transaction with an output. In particular, the functional model replies essentially instantaneously to a transaction initiator. The conception of functional models is easy as the implementation usually only consists of a static mapping of inputs to outputs. However, if the mapping is not static (e.g., functionally dependent), the conception of the functional model is more complex.

The cycle-based model is employed for reproducing the observable state of a device or IP building block in every cycle. Usually, a cycle-based model does not have a deterministic behavior, which means that there is no knowledge of the result of a transaction when the transaction is initiated. A cycle-based model is commonly designed to first collect all information regarding the transaction and then make the transaction progress up to the completion. Accordingly, a transaction is completed in a series of steps which are clocked corresponding to some clock frequency. Thus, for each clock cycle, the cycle-based model modifies its internal state progressing with the transaction. Consequently, a cycle-based model can provide a clock cycle accurate representation of the behavior of a hardware device implementing a device or IP building block.
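The per-cycle progression described above can be illustrated with a minimal Python sketch. The class CycleBasedMemory, its tick() method and the latency of three cycles are hypothetical names and values chosen for illustration only; they are not taken from the present description.

```python
class CycleBasedMemory:
    """Hypothetical cycle-based slave: a read takes `latency` clock cycles."""

    def __init__(self, contents, latency=3):
        self.contents = dict(contents)
        self.latency = latency
        self.pending_addr = None  # address of the transaction in flight
        self.remaining = 0        # cycles left until the reply is ready

    def start_read(self, addr):
        # First collect the information describing the transaction.
        self.pending_addr = addr
        self.remaining = self.latency

    def tick(self):
        """Advance by one clock cycle; return the data once the read completes."""
        if self.pending_addr is None:
            return None
        self.remaining -= 1  # internal state changes on every cycle
        if self.remaining > 0:
            return None
        data = self.contents[self.pending_addr]
        self.pending_addr = None
        return data


mem = CycleBasedMemory({0x10: 42}, latency=3)
mem.start_read(0x10)
replies = [mem.tick() for _ in range(4)]  # one entry per simulated clock cycle
```

In this sketch the initiator learns the result only on the third call to tick(), which mirrors the statement that the result of a transaction is not known when the transaction is initiated.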

As can be seen from the above, each of the two modeling approaches features an inherent design strategy which may be advantageously employed in different simulations. In particular, a simulation consisting of functional models exhibits a fast simulation speed and is easier to develop. Such a simulation, however, lacks accuracy in comparison with a cycle-based simulation. In contrast, a simulation consisting of cycle-based models is more accurate but usually has a slower simulation speed and is more complex to develop.

Thus, considering the advantages and disadvantages of a functional simulation and a cycle-based simulation, both simulations can be beneficially employed at different phases in the design process of a hardware platform. Usually, the functional models are used in an early design phase as functional models are developed more quickly. During testing there is usually the need for more accuracy, so every model needs to be rewritten to feature the cycle-based behavior.

The simulation library SystemC is known to provide a simulation engine for a joint hardware and software design. Commonly, the SystemC language is used for modeling clocked processes, namely by a simulation engine scheduling each process according to predefined time requisites. Further, the SystemC language also allows defining processes which are similar to functional models, as processes are not continuously triggered according to predefined clock cycles. In a simulation only containing blocks defining processes, the execution order of the simulation is determined by the sequence according to which information is transmitted between the processes.

The combination of both model types in one simulation results in a non-cycle-accurate simulation result. Further, when simulating concurrent transactions, additional precaution measures are needed for ensuring that each block operates according to the correct simulation timing. For this purpose, SystemC provides the concept of wait( ) operations. Thereby, the output of a functional model can be postponed for a variable number of simulation cycles in order to avoid data inconsistencies for concurrent transactions. Accordingly, a process block with a wait( ) operation behaves from the outside as a cycle-based model, only with the difference of the block implementing a functional model.

A detailed description on SystemC is given, for instance, in IEEE Std 1666™-2005, “IEEE Standard SystemC® Language Reference Manual,” version 2.1, March 2006 (available at http://www.ieee.org and incorporated herein by reference).

Although the above-described implementation in the SystemC language allows for combining functional and cycle-based models, the implementation of functional models with wait( ) operations results in drawbacks. Due to the wait( ) operation, the functional model is made dependent on the simulation engine for scheduling the outputs, and the response of a functional model is delayed by the wait( ) operation, resulting in the lengthening of the overall simulation time.

SUMMARY

Embodiments of the present invention provide a new simulation approach for simulating a multi-core hardware platform and improving the simulation speed while maintaining the simulation accuracy relative to a simulation of the same hardware platform, the devices of which are solely modeled by cycle-based models.

Embodiments of the present invention enable the use of functional models in a cycle-accurate simulation wherein the functional models still maintain functional properties (i.e., of responding immediately to a transaction).

Embodiments of the present invention allow for a flexible combination of functional and cycle-based models within a simulation.

Functional models are capable of immediately replying to a model initiating the transaction. Accordingly, replacing a cycle-based model with a functional model can speed up the execution of the simulation. However, functional models cannot be used within a cycle-accurate simulation where the cycles in the simulation must correspond to the cycles of the simulated hardware platform. Therefore, a first embodiment of the present invention extends purely functional models to timed-functional models capable of including a transaction time in the reply to a transaction. The returned transaction time indicates a delay which would have been introduced by a hardware device replying to the transaction. With the transaction time the transaction result of a functional model can be accurately timed by aligning and/or delaying the simulation results with respect to the main clock of the simulation. In comparison to a simulation of the same hardware platform where the devices are solely described by cycle-based models (and assuming the same level of accuracy of the models), this first embodiment of the present invention allows for a faster simulation speed due to the functional models replying immediately with a same time accuracy. Further, assuming a cycle-accurate description of a device modeled by a functional model, the provision of a transaction time enables the functional model to be used in a cycle-accurate simulation.
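As a rough illustration of a timed-functional model, the following Python sketch returns the result together with a transaction time in a single immediate reply. The names Reply, TimedFunctionalMemory and visible_at, as well as the three-cycle read latency, are assumptions made for this example and do not appear in the present description.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    result: int       # data returned by the slave
    exec_cycles: int  # estimated main-clock cycles the hardware would need


class TimedFunctionalMemory:
    """Hypothetical timed-functional slave: replies at once, with a cycle count."""

    def __init__(self, contents, read_cycles=3):
        self.contents = dict(contents)
        self.read_cycles = read_cycles  # assumed fixed read latency

    def read(self, addr):
        # No waiting and no per-cycle state: result and timing in one reply.
        return Reply(self.contents[addr], self.read_cycles)


def visible_at(issue_cycle, reply):
    """Main-clock cycle at which the initiator may act on the result."""
    return issue_cycle + reply.exec_cycles


mem = TimedFunctionalMemory({0x20: 7}, read_cycles=3)
reply = mem.read(0x20)        # returned immediately
ready = visible_at(10, reply) # aligned to the main clock by the initiator
```

The model itself never waits; the alignment to the main clock is left to the initiator, here reduced to a simple addition.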

A second embodiment of the present invention suggests the modification of the functional model such that it provides the same time information as a cycle-based model. According to this second embodiment of the present invention, the functional model is capable of immediately responding to a transaction with a result and a cycle count indicating how long the processing of the transaction would have taken for a hardware device. In other words, the functional model is capable of providing sufficient information to a cycle-accurate model acting as the transaction initiator such that the transaction initiator can align and/or delay the processing of the received information for a cycle-accurate simulation.

In an exemplary embodiment according to the first and second embodiments of the present invention, the functional model provides an execution time as an approximation of the transaction time which would have been necessary for the represented hardware device to execute the operation requested via the transaction.

A further, third embodiment of the present invention is the modification of the functional model such that the cycle-based model and the functional model are interchangeably used in the simulation. In other words, the third embodiment adapts the functional and the cycle-based models such that both model types implement the same interface. Thereby, the simulation engine can switch between a functional model indicating the transaction time and a cycle-based model depending on an internal state of the simulation system. Alternatively, the third embodiment implements both a functional and a cycle-based behavior for an operation. Thereby, the simulation engine is also enabled to determine the model behavior depending on an internal state of the simulation system.

One embodiment of the present invention is a computer-implemented method for simulating a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by either a functional model or a cycle-based model. The simulation system simulates the hardware platform by initiating a transaction by a model taking the role of a master model to request the execution of an operation by a model taking the role of a slave model, by executing the requested operation by the slave model, and by replying to the transaction by the slave model returning a result of the executed operation to the master model. In the case where the slave model is a functional model, the slave model in the simulation is adapted to execute the operation requested by the transaction and immediately reply thereto by returning the result of the executed operation and information on the execution time. The execution time indicates an estimated number of cycles of a main clock which the device represented by the functional slave model would have required for executing the operation.

In one exemplary implementation, a simulation engine of the computer-implemented method schedules the execution of the operation requested by the transaction and the reply thereto relative to the cycles of a main clock in the case where the slave model is a cycle-based model.

Furthermore, the cycle-based models may define different execution cycles. For example, each cycle-based model has a predefined cycle T_C which is an integer multiple of the cycle T_M of the main clock. The simulation engine schedules the execution of an operation requested by a transaction and a reply thereto of each of the cycle-based models relative to the respective cycle T_C.
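The relation between a model cycle T_C and the main-clock cycle T_M can be sketched as follows in Python. The loop is a simplified stand-in for a simulation engine; the names ClockedModel and run, and the assumption that every model is triggered at main cycle 0, are illustrative and not taken from the present description.

```python
class ClockedModel:
    """Hypothetical cycle-based model whose cycle T_C equals `period` * T_M."""

    def __init__(self, name, period):
        self.name = name
        self.period = period  # integer multiple of the main-clock cycle
        self.ticks = 0

    def tick(self):
        self.ticks += 1  # a real model would advance its transaction here


def run(models, main_cycles):
    """Trigger each model only on main-clock cycles that match its own cycle."""
    for cycle in range(main_cycles):
        for model in models:
            if cycle % model.period == 0:
                model.tick()


cpu = ClockedModel("cpu", period=1)
bus = ClockedModel("bus", period=2)
mem = ClockedModel("mem", period=3)
run([cpu, bus, mem], main_cycles=12)
```

Over twelve main-clock cycles the three models in this sketch are triggered 12, 6 and 4 times respectively.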

The master model may be a cycle-based master model. In this case, upon receipt of the reply to the transaction including the result and the information on the execution time, the master model is suspended for a number of cycles of the main clock corresponding to the execution time indicated in the received information.
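A minimal Python sketch of such a suspension is given below. The class CycleBasedMaster, the callable slave interface returning a (result, cycles) pair, and the choice that the suspension starts with the cycle following the request are all assumptions made for illustration.

```python
class CycleBasedMaster:
    """Hypothetical cycle-based master that issues consecutive reads."""

    def __init__(self, slave):
        self.slave = slave       # callable: addr -> (result, exec_cycles)
        self.suspended_for = 0   # main-clock cycles left to stay suspended
        self.next_addr = 0
        self.results = []

    def tick(self):
        if self.suspended_for > 0:
            self.suspended_for -= 1  # still waiting out the execution time
            return
        # The functional slave replies immediately with result and cycle count.
        result, exec_cycles = self.slave(self.next_addr)
        self.results.append(result)
        self.next_addr += 1
        self.suspended_for = exec_cycles


def functional_slave(addr):
    # Assumed behavior: data is addr * 10, every access costs 2 cycles.
    return addr * 10, 2


master = CycleBasedMaster(functional_slave)
for _ in range(6):  # six main-clock cycles
    master.tick()
```

With a two-cycle execution time the master issues a request on cycles 0 and 3 only, so two results are collected within six cycles.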

In another exemplary embodiment of the present invention, the master model is a functional model and the master model is taking the role of a slave model for another master model representing a device of the simulated hardware platform, the other master model initiating another transaction for requesting the execution of an operation by the master model. In this case, upon receipt of the reply to the transaction including the result and the information on the execution time, the master model executes the operation requested by the other transaction and immediately replies thereto returning the result of the execution of the different operation and the sum of the received number of cycles and of the estimated number of cycles associated with the execution of the operation as information on the execution time.
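The accumulation of cycle counts along a chain of functional models can be sketched in Python using a cache that acts as a slave for a CPU and as a master for a memory. The class names and the latencies of 1 and 10 cycles are hypothetical values chosen for this example.

```python
class FunctionalMemory:
    READ_CYCLES = 10  # assumed estimate for a memory read

    def __init__(self, contents):
        self.contents = dict(contents)

    def read(self, addr):
        return self.contents[addr], self.READ_CYCLES


class FunctionalCache:
    LOOKUP_CYCLES = 1  # assumed estimate for the cache's own operation

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def read(self, addr):
        if addr in self.lines:
            # Hit: only the cache's own estimated cycles are reported.
            return self.lines[addr], self.LOOKUP_CYCLES
        # Miss: the cache takes the master role towards the memory.
        data, mem_cycles = self.memory.read(addr)
        self.lines[addr] = data
        # Reply with the sum of the received and the own estimated cycles.
        return data, mem_cycles + self.LOOKUP_CYCLES


cache = FunctionalCache(FunctionalMemory({0x40: 7}))
miss = cache.read(0x40)
hit = cache.read(0x40)
```

The initiator of the outer transaction receives one combined cycle count and does not need to know how many models were involved in producing the result.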

In one exemplary embodiment of the present invention, the simulation engine is adapted to schedule the execution of an operation requested by a transaction and a reply thereto of each of the cycle-based models at different points in time within a cycle of the main clock.

In another exemplary embodiment of the present invention, the result which is returned by a slave model as a reply to a transaction requesting the execution of an operation indicates one of the following states: the COMPLETED state, where the operation is successfully completed; the PENDING state, where the operation is pending; and the ERROR state, where the execution of the operation results in an error.

In another exemplary embodiment of the present invention, the simulation engine is adapted to suspend a master model upon the master model receiving as a reply to a transaction requesting the execution of an operation of a slave model a result indicating a PENDING state.
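The three result states and the suspension on a PENDING reply can be illustrated with the Python sketch below. The state names COMPLETED, PENDING and ERROR follow the text above; everything else is an assumption, in particular the class SlowSlave and the policy that the engine retries the request on every cycle while the master is suspended.

```python
from enum import Enum, auto


class Status(Enum):
    COMPLETED = auto()  # operation successfully completed
    PENDING = auto()    # operation still pending
    ERROR = auto()      # execution resulted in an error


class SlowSlave:
    """Hypothetical slave that stays busy for a number of cycles."""

    def __init__(self, busy_cycles):
        self.busy = busy_cycles

    def request(self):
        if self.busy > 0:
            return Status.PENDING, None
        return Status.COMPLETED, 99

    def tick(self):
        if self.busy > 0:
            self.busy -= 1


def run(slave, max_cycles):
    """Engine loop: keep the master suspended while the reply is PENDING."""
    trace = []
    for cycle in range(max_cycles):
        status, result = slave.request()
        if status is Status.PENDING:
            trace.append((cycle, "suspended"))
            slave.tick()  # the slave progresses while the master waits
            continue
        if status is Status.ERROR:
            trace.append((cycle, "error"))
            break
        trace.append((cycle, result))
        break
    return trace


trace = run(SlowSlave(busy_cycles=2), max_cycles=10)
```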

Another alternative embodiment of the present invention also provides a computer-implemented method for simulating a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by a functional model and/or a cycle-based model. At least one device of the hardware platform is represented by both a functional model and a cycle-based model. The functional model and the cycle-based model have a common interface. The simulation system simulates the hardware platform by initiating a transaction by a model taking the role of a master model to request the execution of an operation by one of the functional model and the cycle-based model representing the same device of the hardware platform, by determining according to an internal state of the simulation system which one of the two models is used as slave model for the device, by executing the requested operation by the determined slave model, and by replying to the transaction by the slave model returning a result of the executed operation to the master model.
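The following Python sketch shows one possible shape of a common interface and of the selection between the two models of the same device. The interface RomModel, the flag need_cycle_accuracy standing in for the internal state of the simulation system, and the latency of four cycles are hypothetical and serve only as an illustration.

```python
from abc import ABC, abstractmethod


class RomModel(ABC):
    """Common interface shared by both models of the same device (hypothetical)."""

    @abstractmethod
    def read(self, addr):
        """Return (data, cycles) for a read of `addr`."""


class FunctionalRom(RomModel):
    EST_CYCLES = 4  # assumed estimate of the hardware latency

    def __init__(self, contents):
        self.contents = contents

    def read(self, addr):
        # Reply immediately with the result and the estimated cycle count.
        return self.contents[addr], self.EST_CYCLES


class CycleBasedRom(RomModel):
    LATENCY = 4  # assumed latency in main-clock cycles

    def __init__(self, contents):
        self.contents = contents

    def read(self, addr):
        # Step through the transaction one clock cycle at a time.
        cycles = 0
        while cycles < self.LATENCY:
            cycles += 1  # a real model would update its internal state here
        return self.contents[addr], cycles


def select_slave(functional, cycle_based, sim_state):
    """Pick the slave model according to an internal state of the simulation."""
    return cycle_based if sim_state["need_cycle_accuracy"] else functional


contents = {0: 0xAB}
fast, exact = FunctionalRom(contents), CycleBasedRom(contents)
boot = select_slave(fast, exact, {"need_cycle_accuracy": False})
probe = select_slave(fast, exact, {"need_cycle_accuracy": True})
```

Because both classes implement the same read() signature, the master model in this sketch does not need to know which of the two models answered its transaction.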

A further alternative embodiment of the present invention also provides a computer-implemented method for simulating a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by a functional model and/or a cycle-based model. At least one device of the hardware platform is represented by a model including a cycle-based implementation of an operation and a functional implementation of the same operation. The simulation system simulates the hardware platform by initiating a transaction by a model taking the role of a master model to request the execution of an operation by a model taking the role of a slave model, the slave model including a cycle-based implementation of the requested operation and a functional implementation of the same operation, by determining according to an internal state of the simulation system which one of the two implementations is used by the slave model for executing the requested operation, by executing the determined implementation of the requested operation by the slave model, and by replying to the transaction by the slave model returning a result of the executed operation to the master model.

In one further exemplary embodiment of the present invention, the slave model in the simulation is adapted to execute the operation requested by the transaction and to immediately reply thereto by returning the result of the executed operation and information on the execution time in the case where the slave model is a functional model. The execution time indicates an estimated number of cycles of a main clock which the device represented by the functional slave model would have required for executing the operation.

A further alternative embodiment of the present invention relates to a computer program for executing a simulation of a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by either a functional model or a cycle-based model. The computer program when executed on a processor simulates the hardware platform by causing a model taking the role of a master model to initiate a transaction to request the execution of an operation by a model taking the role of a slave model, by causing the slave model to execute the requested operation, and by causing the slave model to reply to the transaction returning a result of the executed operation to the master model. In the case where the slave model is a functional model, the slave model executes the operation requested by the transaction and immediately replies thereto by returning the result of the executed operation and information on the execution time. The information on the execution time indicates an estimated number of cycles of a main clock which the device represented by the functional slave model would have required for executing the operation.

The computer-readable data medium according to an exemplary embodiment of the present invention stores instructions that, when executed by a processor of a simulation system, cause the simulation system to simulate a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by either a functional model or a cycle-based model. The instructions cause the simulation system to simulate the hardware platform by a model taking the role of a master model initiating a transaction to request the execution of an operation by a model taking the role of a slave model, by the slave model executing the requested operation, and by the slave model replying to the transaction returning a result of the executed operation to the master model. In the case where the slave model is a functional model, the slave model executes the operation requested by the transaction and immediately replies thereto by returning the result of the executed operation and information on the execution time. The information on the execution time indicates an estimated number of cycles of a main clock which the device represented by the functional slave model would have required for executing the operation.

Another exemplary embodiment of the present invention provides a simulation system including a processor causing the simulation system to simulate a multi-core hardware platform including a plurality of devices, and a memory for storing intermediate simulation results. Each device is represented in the simulation by either a functional model or a cycle-based model. The simulation system simulates the hardware platform by a model taking the role of a master model initiating a transaction to request the execution of an operation by a model taking the role of a slave model, by the slave model executing the requested operation, and by the slave model replying to the transaction returning a result of the executed operation to the master model. In the case where the slave model is a functional model, the slave model executes the operation requested by the transaction and immediately replies thereto by returning the result of the executed operation and information on the execution time. The execution time indicates an estimated number of cycles of a main clock which the device represented by the functional slave model would have required for executing the operation.

A further alternative embodiment of the present invention relates to a computer program for executing a simulation of a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by a functional model and/or a cycle-based model. At least one device of the hardware platform is represented by both a functional model and a cycle-based model. The functional model and the cycle-based model have a common interface. The computer program when executed on a processor simulates the hardware platform by causing a model taking the role of a master model to initiate a transaction to request the execution of an operation by one of the functional model and the cycle-based model representing the same device of the hardware platform, by causing the processor to determine according to an internal state of the simulation system which one of the two models is used as slave model for the device, by causing the determined slave model to execute the requested operation, and by causing the slave model to reply to the transaction returning a result of the executed operation to the master model.

The computer-readable data medium according to an exemplary embodiment of the present invention stores instructions that, when executed by a processor of a simulation system, cause the simulation system to simulate a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by a functional model and/or a cycle-based model. At least one device of the hardware platform is represented by both a functional model and a cycle-based model. The functional model and the cycle-based model have a common interface. The instructions cause the simulation system to simulate the hardware platform by a model taking the role of a master model initiating a transaction to request the execution of an operation by one of the functional model and the cycle-based model representing the same device of the hardware platform, by the processor determining according to an internal state of the simulation system which one of the two models is used as slave model for the device, by the determined slave model executing the requested operation, and by the slave model replying to the transaction returning a result of the executed operation to the master model.

Another exemplary embodiment of the present invention is a simulation system including a processor causing the simulation system to simulate a multi-core hardware platform including a plurality of devices, and a memory for storing intermediate simulation results. Each device is represented in the simulation by a functional model and/or a cycle-based model. At least one device of the hardware platform is represented by both a functional model and a cycle-based model. The functional model and the cycle-based model have a common interface. The simulation system simulates the hardware platform by a model taking the role of a master model initiating a transaction to request the execution of an operation by one of the functional model and the cycle-based model representing the same device of the hardware platform, by the processor determining according to an internal state of the simulation system which one of the two models is used as slave model for the device, by the determined slave model executing the requested operation, and by the slave model replying to the transaction returning a result of the executed operation to the master model.

A further alternative embodiment of the present invention relates to a computer program for executing a simulation of a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by a functional model and/or a cycle-based model. At least one device of the hardware platform is represented by a model including a cycle-based implementation of an operation and a functional implementation of the same operation. The computer program, when executed on a processor, simulates the hardware platform by causing a model taking the role of a master model to initiate a transaction to request the execution of an operation by a model taking the role of a slave model, the slave model including a cycle-based implementation of the requested operation and a functional implementation of the same operation, by causing the processor to determine according to an internal state of the simulation system which one of the two implementations is used by the slave model for executing the requested operation, by causing the slave model to execute the determined implementation of the requested operation, and by causing the slave model to reply to the transaction returning a result of the executed operation to the master model.

The computer-readable data medium according to an exemplary embodiment of the present invention stores instructions that, when executed by a processor of a simulation system, cause the simulation system to simulate a multi-core hardware platform including a plurality of devices. Each device is represented in the simulation by a functional model and/or a cycle-based model. At least one device of the hardware platform is represented by a model including a cycle-based implementation of an operation and a functional implementation of the same operation. The instructions cause the simulation system to simulate the hardware platform by a model taking the role of a master model initiating a transaction to request the execution of an operation by a model taking the role of a slave model, the slave model including a cycle-based implementation of the requested operation and a functional implementation of the same operation, by the processor determining according to an internal state of the simulation system which one of the two implementations is used by the slave model for executing the requested operation, by the slave model executing the determined implementation of the requested operation, and by the slave model replying to the transaction returning a result of the executed operation to the master model.

Another exemplary embodiment of the present invention is a simulation system including a processor causing the simulation system to simulate a multi-core hardware platform including a plurality of devices, and a memory for storing intermediate simulation results. Each device is represented in the simulation by a functional model and/or a cycle-based model. At least one device of the hardware platform is represented by a model including a cycle-based implementation of an operation and a functional implementation of the same operation. The simulation system simulates the hardware platform by a model taking the role of a master model initiating a transaction to request the execution of an operation by a model taking the role of a slave model, the slave model including a cycle-based implementation of the requested operation and a functional implementation of the same operation, by the processor determining according to an internal state of the simulation system which one of the two implementations is used by the slave model for executing the requested operation, by the slave model executing the determined implementation of the requested operation, and by the slave model replying to the transaction returning a result of the executed operation to the master model.

BRIEF DESCRIPTION OF THE DRAWINGS

In the following, embodiments of the present invention are described in more detail referring to the attached figures and drawings. Similar or corresponding details in the figures are marked with the same reference numerals.

FIG. 1 schematically shows an example of a multi-core platform and a simulation system to be used for the simulation according to an exemplary embodiment of the present invention,

FIG. 2 schematically shows a simplified multi-core platform with shared devices according to an exemplary embodiment of the present invention,

FIG. 3 illustrates a simplified example of a hardware platform having only one initiator according to an exemplary embodiment of the present invention,

FIGS. 4A and 4B schematically show an external interface for a transaction of a master model and of a slave model according to an exemplary embodiment of the present invention,

FIG. 5 illustrates an exemplary procedure for a cycle-based slave model to reply to a transaction according to an exemplary embodiment of the present invention,

FIG. 6 shows an exemplary timing diagram of a simplified “cache-miss” operation of an instruction cache taking the role of a master and a slave model according to an exemplary embodiment of the present invention,

FIG. 7 schematically shows the sequence of operations to be performed by a master model upon receipt of a reply to a transaction according to an exemplary embodiment of the present invention.

DETAILED DESCRIPTION

Before describing embodiments of the present invention in more detail below, some definitions and conventions that are used in this document are first set out.

“Device”: The term device relates to a physical entity or logical entity of the hardware platform to be simulated. In some embodiments of the present invention, a device is a separate physical unit. However, it is also possible that a single physical entity is represented by multiple devices. For example, a cache may also be represented by multiple devices, e.g., one device representing the write buffer of the cache, another device representing the cache memory. In fact, the definition of a device within the simulation and its relation to the real-world hardware is up to the engineer designing the simulation models. Examples of devices are caches, memories, networks, buses, MMUs (memory management units), etc., or logical or physical sub-units thereof.

“IP”: The term IP (or IP building block) is used as a synonym for a device herein, as mentioned above.

“Simulation System”: The term simulation system refers to a computing apparatus or computing system running the simulation. For example, in one embodiment of the present invention, the simulation system may be a general purpose computer. In another embodiment of the present invention, the simulation system is realized as any other type of computing apparatus, computing-like apparatus and/or hardware structure including at least a CPU, a mass storage component, a memory, and a user input/output device.

“Simulation Engine”: The term simulation engine refers to a piece of software for running a simulation. For example, the simulation engine may be a runtime environment of the simulation system. The tasks of the simulation engine may, for example, include one or more of the following: defining the start of the simulation; scheduling operations, e.g., transactions to be performed; and determining the termination of the simulation. Furthermore, the simulation environment may provide a simulation clock, also referred to as the main clock herein. All simulation operations are performed in accordance with the cycles of this main clock, e.g., the system clock of the simulation system.

“Cycle-Accurate Simulation”: The term cycle-accurate simulation is used to describe a simulation that ensures correct simulation results and timing by simulation models accurately handling transactions relative to the cycles of the main clock. A cycle-accurate simulation thus accurately reflects the behavior of the simulated devices in terms of results and time. Each model may initiate a transaction, for example, for requesting an operation to be performed by another model. The initiating model puts the reply in an accurate time relationship to other transactions by relating each transaction reply to the cycles of the main clock.

“Model”: A model represents a device or an IP of the hardware platform to be simulated. As each model may only provide a certain level of abstraction of the corresponding device or IP, there may be multiple different models for the same device or IP. Embodiments of the present invention distinguish at least the following types of models:

“Functional/Timed-Functional Model”: A functional model is a functionally accurate description of the behavior of a device or IP building block to the outside, without modeling of the internal implementation details of the represented device or IP (e.g., internal state information, a clock cycle representation, a predefined execution speed). This allows a functional model to reply to a request from a master model instantaneously. Further, the term “timed-functional” model indicates that results of the requested operation provided by the functional model additionally include a transaction time (i.e., the time between the reception and the result of a transaction). The transaction time indicates the time that the execution of the requested operation and the response of the execution result would have taken on the simulated device or IP being represented by the timed-functional model. The transaction time may be expressed in cycles of the main clock and may be approximated by the execution time of executing the requested operation.

“Cycle-based model”: A cycle-based model is designed to reproduce the observable state of a device or an IP block in every cycle. A cycle-based model does not have a deterministic behavior as there is no knowledge of the output/result of a transaction when the transaction is initiated. Accordingly, a transaction is completed in a series of steps which are scheduled, for example, corresponding to the predefined cycle ratio of the represented device. Thus, for each predefined cycle, the internal state of the cycle-based model is modified.

“Transaction”: The term transaction refers to the operations to be performed between two models. A first model initiates the transaction to a second model and the second model responds to said transaction. To indicate the role of each model, a model taking the role of the transaction initiator is called master model, and a model receiving and replying to the transaction is called slave model. In order to reply to a transaction the slave model may execute computations. As models can take the role of a master model as well as the role of a slave model for different transactions, the master and the slave property is defined with respect to a given transaction. The terms master IP and slave IP are used similarly as the terms master model and slave model.

“Immediate reply”: The term immediate reply means that a slave model responds to a transaction within one clock cycle of the main clock. Hence, the request for the transaction from a master model and the response thereto by the slave model must be provided within the same clock cycle of the main clock.

Referring now to FIG. 1, an exemplary multi-core hardware platform 100 and a simulation system 105 are depicted.

The simulation system 105, shown in FIG. 1, is a computing device capable of running a program defining the simulation method as set out below. In particular, the simulation system 105 of FIG. 1 is depicted as a general purpose computer only as an illustrative example. Alternatively, the simulation system 105 could be any other kind of computing device, computing-like device and/or hardware structure composed of a CPU, a storage medium, a memory, a user input/output device and the like.

As indicated by the arrow, embodiments of the present invention relate to the simulation system running a simulation of models representing the hardware platform 100. Usually, the simulation is provided as a program written in a programming language. Accordingly, the simulation method includes models which implement the functionality of the represented devices or IP building blocks. Each model may, for instance, implement an operation which a different model may request to be executed (e.g. a memory model may implement a read( ) function to be executed by a different model). Upon receiving a request for executing an operation, the model may execute its operation e.g. within its own namespace.

In particular, such a request for executing an operation may be realized as a function call of the operation provided by a model. Strictly speaking, however, the above wording has only been introduced for simplicity. The description should be understood such that the processor of a simulation system executes all operations, and that a simulation engine or a kernel is provided by the simulation method which performs the scheduling of the execution of operations and other time related operations (e.g. callback mechanism). Nevertheless, a description with models executing operations is chosen as it is coherent with the execution of operations by the (hardware) devices to be simulated.

In the simulation according to embodiments of the present invention, two types of models are combined, namely functional models and cycle-based models. Functional models have the advantage of immediately replying to a transaction requesting the execution of an operation (i.e. replying within the same clock cycle of the simulation). This advantage results from an implementation of a functional model that is not time dependent. Cycle-based models are scheduled according to a predefined cycle by the simulation engine.

In order to allow the cooperation of the cycle-based models and the functional models, the functional models are adapted to reply with a result to a transaction including a transaction time (i.e. the transaction time being the time between the reception and the reply of a transaction). However, to implement a simulation with functional models replying with a result including the transaction time, the transaction initiator models (i.e. master models) have to be adapted. For example, the transaction initiator models may be suspended upon receipt of the transaction time. In the case where a transaction initiator model initiates two transactions: one to a functional slave model (which would have taken e.g. 4 cycles for the execution) and another to a cycle-based slave model (which takes e.g. 4 cycles for the execution), the temporary suspending of the transaction initiator model may be the only option for both results to arrive at the same time.

Further, the suspending of a transaction initiator model receiving a reply to a transaction with a transaction time may be realized in cycle-based slave models. In general, cycle-based models have no deterministic behavior. Accordingly, while executing the simulation of a cycle-based model, there is no deterministic knowledge of how the model will progress until completion. Therefore, the simulation of a cycle-based model is modified to suspend the cycle-based model when receiving a transaction result and a transaction time indicating the completion of a transaction for a future point in time.

Alternately, a transaction initiator model may propagate the received transaction time in an upward direction of transaction dependent models. This concept can be advantageously realized in transaction initiator models which are functional models. For instance, in the case of three transaction dependent functional models, namely a first functional model initiating a first transaction to a second functional model whereupon the second functional model initiates a dependent transaction to a third functional model, the first functional model may receive a reply to the initiated transaction including time information corresponding to the sum of the time for the first transaction and of the dependent second transaction.

In particular, a functional model immediately responds to a transaction, namely within the same clock cycle. Accordingly, a functional model also receives a transaction result and a transaction time and replies to another transaction within the same clock cycle. Thus, the sum of the received transaction time plus the transaction time for responding to the other transaction corresponds to the cycles of the main clock which the two transactions would have taken in the represented (hardware) devices to be executed.
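The upward propagation of transaction times through dependent functional models can be sketched as follows. This is a minimal illustration only: the class name FunctionalModel, the Reply record, the per-model cycle costs and the "+1" operation are hypothetical and are not taken from the described embodiments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    result: int   # value produced by the requested operation
    cycles: int   # estimated main-clock cycles the hardware would have needed


class FunctionalModel:
    """Hypothetical timed-functional model replying within the same cycle."""

    def __init__(self, name: str, own_cycles: int,
                 downstream: Optional["FunctionalModel"] = None):
        self.name = name
        self.own_cycles = own_cycles   # assumed cost of this model's operation
        self.downstream = downstream   # model this one depends on, if any

    def transact(self, value: int) -> Reply:
        cycles = self.own_cycles
        if self.downstream is not None:
            # Dependent transaction: act as master toward the next model
            # and add the time it reports to our own time.
            inner = self.downstream.transact(value)
            value = inner.result
            cycles += inner.cycles
        # Placeholder operation: increment the value.
        return Reply(result=value + 1, cycles=cycles)


# The first model initiates a transaction to the second model, which in
# turn depends on the third model.
third = FunctionalModel("third", own_cycles=3)
second = FunctionalModel("second", own_cycles=2, downstream=third)
first_reply = second.transact(0)
print(first_reply)
```

In this sketch the reply seen by the first model carries the sum of the cycles of the second and third models, although all calls return within the same simulated clock cycle.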

Furthermore, the simulation according to embodiments of the present invention also enables a dynamically changeable cooperation of the models. Normally, each device of the hardware platform to be simulated is represented by one model, namely a functional or a cycle-based model. However, there are simulation embodiments of the present invention for which one or the other implementation is preferable.

Accordingly, the simulation is capable of handling a functional and a cycle-based model representing the same device to be simulated. For this purpose, the cycle-based and the functional model have the same transaction interface. The simulation engine dynamically determines, depending on an internal state, which of the two models is used in the simulation for representing the device. The internal state may be set by a user for the whole duration of the simulation. Alternately, a user may also specify models to be changed depending on a predefined clock cycle of the simulation clock.

Alternately, the simulation is capable of handling a model including a functional and a cycle-based implementation of the same operation to be requested for execution by a transaction. In this case, the simulation engine dynamically determines according to an internal state which of the two implementations of the same operation is executed by the model upon receiving a request for execution of the operation. The internal state may be set by a user for the whole duration of the simulation. Alternately, a user may also specify models to be changed depending on a predefined clock cycle of the simulation clock.
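A model holding both implementations of the same operation, with the choice driven by an internal state of the simulation engine, might look like the following sketch. The names Mode, Engine and DualMemoryModel, the switch_at_cycle parameter and the tuple-shaped replies are assumptions made for illustration; the cycle-based branch is reduced to returning a PENDING state.

```python
from enum import Enum


class Mode(Enum):
    FUNCTIONAL = "functional"
    CYCLE_BASED = "cycle_based"


class Engine:
    """Hypothetical engine holding the internal state that selects the mode."""

    def __init__(self, default, switch_at_cycle=None, switch_to=None):
        self.cycle = 0                          # current main-clock cycle
        self.default = default                  # mode set by the user
        self.switch_at_cycle = switch_at_cycle  # optional predefined cycle
        self.switch_to = switch_to              # mode used from that cycle on

    def mode(self):
        if self.switch_at_cycle is not None and self.cycle >= self.switch_at_cycle:
            return self.switch_to
        return self.default


class DualMemoryModel:
    """One model, two implementations of the same read operation."""

    def __init__(self, engine, latency=3):
        self.engine = engine
        self.latency = latency   # assumed hardware latency in cycles
        self.data = {}

    def read(self, addr):
        if self.engine.mode() is Mode.FUNCTIONAL:
            return self._read_functional(addr)
        return self._read_cycle_based(addr)

    def _read_functional(self, addr):
        # Immediate result plus the estimated execution time.
        return ("COMPLETED", self.data.get(addr, 0), self.latency)

    def _read_cycle_based(self, addr):
        # A full model would register a pending transaction here.
        return ("PENDING", None, None)


engine = Engine(Mode.FUNCTIONAL, switch_at_cycle=100, switch_to=Mode.CYCLE_BASED)
mem = DualMemoryModel(engine)
mem.data[4] = 7
early = mem.read(4)    # functional implementation is selected
engine.cycle = 100
late = mem.read(4)     # cycle-based implementation is selected
```

Because both implementations sit behind the same read( ) entry point, a master model does not need to know which one was executed.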

Further, the hardware platform 100 of FIG. 1 shows a platform with several transaction initiators, in particular a DMA and several other initiators, i.e. hardware accelerators or programmable hardware accelerators PE (Processing Elements). Furthermore, the hardware platform 100 comprises replying devices such as, for example, a BUS, a Main Memory, a NoC (“Network-on-Chip”) and a bridge. As illustrated in FIG. 1, the transaction initiators are connected to the replying devices, and at least some of the replying devices are shared amongst several initiators; this applies, for example, to the replying device BUS. This exemplary multi-processor platform is preferably composed of a GPE, and a regular array of PE processors or hardware accelerators. Each processor (or hardware accelerator) has its own distributed but uniform memory address space.

The hardware platform 100 of FIG. 1 can be an example of a generic multimedia streaming multi-core platform, which is becoming common not only in standalone devices (DVD or Blu-ray players, set-top boxes, etc.) but also in mobile devices (mobile phones, smart phones, etc.).

Turning now to FIG. 2, a basic architecture of a hardware platform as outlined by FIG. 1 is illustrated in a simplified manner.

Referring now to FIG. 2, a hardware platform is shown as an abstract model of electronic components 205 to 250 that are interconnected with each other by means of data connections, which can for example be an electronic wire, a bus or a network, and the like. The combination of devices and the characteristics of the data connections depicted in FIG. 2 are merely exemplary with respect to embodiments of the present invention. Therefore, the principles of embodiments of the present invention can be applied to any hardware platform including different numbers of devices or different kinds of data connections.

In FIG. 2, transaction initiators 205/210 are illustrated to connect to devices 207-240. In particular, transaction initiator 205 is connected to five devices, namely devices 207 to 211, and devices 235 and 240. The transaction initiator 205 with its connected devices forms a subsystem 250. Similarly, transaction initiator 210 is connected to devices 222 to 226 and devices 235 and 240, thus forming a subsystem 260.

Further, devices 211 and 224 have additional connections for which the devices 211 and 224 take the role of a transaction initiator. In particular, device 211 takes the role of a transaction initiator for the connection to device 245 and device 224 takes the role of a transaction initiator for the connection to device 250. The connections between devices 211/224 and devices 245/250 enable the transaction initiators 205/210 to indirectly communicate with devices 245/250. However, as devices 211/224 are connected in between the transaction initiators 205/210 and devices 245/250, the transaction initiators 205/210 cannot directly initiate a transaction to devices 245/250.

Depending on the functionality of a model, a model may implement the role of either a transaction initiator, namely master model, or of a replying device, namely slave model, or may alternately implement both roles, namely the role of a transaction initiator for a first set of transactions and the role of a replying device for a second set of transactions.

Referring now to FIG. 3, a simplified model of the hardware platform 300 to be simulated is shown. The exemplary hardware platform 300 comprises a processing element named core model 305, an instruction cache model 310 and two memory models 315 and 320. In this example, the memory model 320 is optional.

For core model 305 to run a program executing instructions, the core model fetches an instruction from a memory model 315 or 320, the instruction identifying operator and operands of a program. In particular, core model 305 includes an instruction pointer register which determines a next instruction to be executed corresponding to a sequence of the program. In the described hardware platform, additionally an instruction cache model 310 is provided for speeding up the instruction fetch operation of core model 305.

Generally, the instruction cache 310 is optimized for fast access to the stored information. Thus, in the simulation the cached instructions can be read faster from the instruction cache model 310 than from the memory model 315 or 320 storing the program. However, an instruction cache model only holds a subset of instructions with respect to the whole program. Accordingly, upon core model 305 initiating an instruction fetch operation, the instruction cache model 310 first needs to determine whether the instruction to be fetched is present and/or valid in instruction cache model 310.

In the case where the instruction to be fetched is present, namely a cache-hit, the instruction cache model 310 copies the requested instruction to a specified address, for instance, the register of the core model 305 supplying the next instruction. Thereafter, the instruction cache model 310 replies to the instruction fetch operation of core model 305 indicating a COMPLETED state.

In the case where the instruction to be fetched is not present, namely a cache-miss, the instruction cache model 310 redirects the instruction fetch operation to the memory model including the program. For this purpose, the instruction cache model 310 initiates a transaction requesting an execution of an instruction read operation by the memory model 315 or 320. Due to the delay introduced as latency by the memory model 315 or 320, the instruction cache model responds to the transaction initiator core model after a time corresponding to the sum of the time necessary for the cache-miss operation and the latency of the (hardware) memory.

In more detail, after the latency of the memory elapses, the memory model 315 or 320 copies the read instruction to some address. Upon receipt of the result of the instruction read operation, the instruction cache model 310 is capable of updating the cached instructions. At the same time the cache model 310 copies the requested instruction to a specified address, for instance, the register of the core model 305 supplying the next instruction, and replies indicating a COMPLETED state.
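The cache-hit and cache-miss paths can be sketched with two timed-functional models. This is a simplification under stated assumptions: both the cache and the memory are modeled functionally here, and the names MemoryModel and InstructionCacheModel as well as the cycle counts (hit_cycles, miss_cycles, latency) are hypothetical values chosen for illustration.

```python
class MemoryModel:
    """Hypothetical functional memory holding the program."""

    def __init__(self, program, latency=3):
        self.program = program   # maps address -> instruction
        self.latency = latency   # assumed memory latency in cycles

    def read(self, addr):
        return self.program[addr], self.latency


class InstructionCacheModel:
    """Hypothetical functional instruction cache in front of the memory."""

    def __init__(self, memory, hit_cycles=1, miss_cycles=2):
        self.memory = memory
        self.hit_cycles = hit_cycles     # assumed cost of a cache-hit
        self.miss_cycles = miss_cycles   # assumed cost of handling a miss
        self.lines = {}                  # cached subset of the program

    def fetch(self, addr):
        if addr in self.lines:
            # Cache-hit: the instruction is supplied directly.
            return "COMPLETED", self.lines[addr], self.hit_cycles
        # Cache-miss: act as master toward the memory model.
        instr, mem_cycles = self.memory.read(addr)
        self.lines[addr] = instr         # update the cached instructions
        # Reported time is the miss handling time plus the memory latency.
        return "COMPLETED", instr, self.miss_cycles + mem_cycles


memory = MemoryModel({0: "nop", 4: "add"})
cache = InstructionCacheModel(memory)
miss_reply = cache.fetch(4)   # first access misses
hit_reply = cache.fetch(4)    # second access hits
```

The first fetch reports the sum of the miss handling time and the memory latency, while the repeated fetch of the same address only reports the hit time.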

Referring now to FIGS. 4A and 4B, the interfaces of a master model 405 and a slave model 410 are shown, enabling the master and the slave model to initiate/reply to a transaction.

As shown in FIG. 4A, the interface of the master model 405 is capable of initiating a transaction. The transaction may be used for requesting the execution of an operation. As an example, a CPU, taking the role of the master model 405, can request a cache to provide the next instruction. Further, the interface of the master model 405 also defines a reply to a transaction. For example, the reply may indicate one of the following states: COMPLETED, ERROR and PENDING, where the COMPLETED state defines that the operation is successfully completed; the PENDING state defines that the operation is pending, and the ERROR state defines that the execution of the operation has resulted in an error.

As shown in FIG. 4B, the interface of the slave model 410 is capable of receiving a transaction. A transaction to a slave model 410 may request an operation of the slave model to be executed. Accordingly, upon the slave model 410 receiving a transaction requesting an operation of the slave model 410 to be executed, the slave model 410 processes the requested operation. Slave models may provide different operations, for example a cache model may provide a cache read operation, a memory may provide a memory read and a memory write operation. An operation which is requested to be executed by a master model may also cause a dependent operation/multiple dependent operations to be executed. For example, a memory write operation may also cause the respective data to be invalidated in a cache.

Upon completion of the execution of the requested operation by the slave model 410 and the completion of other, dependent operations by other models, the interface of the slave model defines the reply to indicate the COMPLETED state. Further, if the operation or any dependent operation cannot be immediately processed (e.g. the slave model is a cycle-based model), the interface of the slave model defines the reply to indicate a PENDING state. If any transaction results in an error, the interface of the slave model defines the reply to indicate an ERROR state.

In this exemplary embodiment of the present invention, there is no distinction between a functional master model or a cycle-based master model, or between a functional slave model or a cycle-based slave model because all master models and all slave models implement the same transaction interface. In particular, the functional master model and the cycle-based master model implement the same interface, namely the interface illustrated by FIG. 4A. Further, the functional slave model and the cycle-based slave model implement the same interface, namely the interface illustrated by FIG. 4B.
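A common transaction interface with the three reply states could be sketched as below. The names Status, Reply, Slave and FunctionalMemory, the transact( ) signature and the on_done callback parameter are assumptions for illustration and are not the interface shown in the figures.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional


class Status(Enum):
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"
    PENDING = "PENDING"


@dataclass
class Reply:
    status: Status
    result: Any = None
    cycles: Optional[int] = None   # only filled in by timed-functional slaves


class Slave(ABC):
    """Interface shared by functional and cycle-based slave models."""

    @abstractmethod
    def transact(self, op: str, addr: int,
                 on_done: Callable[[Reply], None]) -> Reply:
        """Handle a transaction; on_done is used for deferred completion."""


class FunctionalMemory(Slave):
    """Hypothetical functional slave that replies immediately."""

    def __init__(self, size: int, read_cycles: int = 4):
        self.cells = [0] * size
        self.read_cycles = read_cycles   # assumed hardware read time

    def transact(self, op, addr, on_done):
        if op != "read" or not 0 <= addr < len(self.cells):
            # Unsupported operation or unmapped address.
            return Reply(Status.ERROR)
        # on_done is unused: a functional slave never replies PENDING.
        return Reply(Status.COMPLETED, self.cells[addr], self.read_cycles)


mem = FunctionalMemory(8)
mem.cells[2] = 42
ok = mem.transact("read", 2, lambda reply: None)
bad = mem.transact("read", 99, lambda reply: None)
```

A cycle-based slave would implement the same transact( ) method but return a PENDING reply and later invoke on_done, so a master model can be written against the Slave interface alone.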

Referring now to FIG. 5, the procedure of executing a requestedoperation in a cycle-based slave model 505 is shown.

The slave model shown in FIG. 5 is a cycle-based model. In contrast to a functional slave model for which a received transaction triggers the execution of a requested operation and the reply to the transaction, in the cycle-based model, the execution is timed according to a main clock which can be e.g. the system clock or a pre-scaled system clock or a different timing mechanism.

When a transaction is received by the cycle-based slave model 505 at time point T₀, the cycle-based slave model registers the transaction as a pending transaction for scheduling the execution of the requested internal operations. The scheduling is performed by a simulation engine. Upon a successful registration of the received transaction as a pending transaction, the cycle-based slave model replies to the transaction initiator indicating a PENDING state. In the case where there is an error in the transaction, the reply to the transaction initiator indicates an ERROR state. An error may result, for example, from a reference to an address onto which no device is mapped, or from a size (number of bytes involved in the transfer) that is not supported by the slave device.

The reply to the transaction indicating the PENDING state is immediately transmitted by the cycle-based slave model 505 to the master model requesting the execution of the operation, namely within the same clock cycle T_(C). With the reply, the control is handed over to the master model by a return operation indicating the PENDING state.

Due to the cycle-based slave model 505 registering the transaction as a pending transaction, the simulation engine of the simulation system starts scheduling the execution of the requested operation at time point T₀+T_(C). The scheduling is performed in two steps.

First, the simulation engine calls the eval( ) function of the cycle-based slave model 505 e.g. for collecting the inputs for the requested operation. Within the eval( ) operation, also other computations may be performed by the cycle-based slave model 505. However, within the eval( ) function the observable state of the cycle-based slave model 505 must not be changed.

Thereafter, the simulation engine calls the commit( ) function for changing the observable state of a cycle-based slave model 505. Thus, the processing of the eval( ) function has completed when the commit( ) function of the cycle-based slave model 505 is scheduled to be executed. As an example, the commit( ) function of a cycle-based slave model may copy bytes from a memory to some predefined address or trigger a callback mechanism.

In the simulation of embodiments of the present invention, the cycle-based models employ the eval( ) and the commit( ) function in order to simulate a rising clock edge triggering the devices that operate in parallel. The cycle-based models are scheduled. The scheduling consecutively processes registered pending transactions. To avoid the destruction of input data by a cycle-based model modifying an accessible state, the processing of each transaction is separated in the eval( ) and in the commit( ) function which are scheduled by the simulation engine consecutively. Accordingly, the simulation engine first executes the eval( ) function of all registered transactions for the cycle-based models before executing the commit( ) function of all registered transactions.
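The reason for separating eval( ) from commit( ) can be illustrated with two models that read each other's observable state in the same cycle. The Register class and the tick( ) helper are hypothetical and only serve to show the two-phase ordering.

```python
class Register:
    """Hypothetical cycle-based model that copies its source every cycle."""

    def __init__(self, value):
        self.value = value    # observable state, visible to other models
        self.source = None    # model whose value is sampled
        self._next = None     # private state computed during eval()

    def eval(self):
        # Read inputs only; the observable state must not change here.
        self._next = self.source.value

    def commit(self):
        # Publish the new observable state.
        self.value = self._next


def tick(models):
    """One simulated clock edge: all eval() calls, then all commit() calls."""
    for model in models:
        model.eval()
    for model in models:
        model.commit()


a = Register(1)
b = Register(2)
a.source = b
b.source = a
tick([a, b])
print(a.value, b.value)
```

With the two phases, the two registers exchange their values as parallel hardware would. If each model updated its observable state immediately, the second model would read the already overwritten value of the first.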

In the example of a cycle-based slave model 505 shown in FIG. 5, the simulation engine schedules the eval( ) and commit( ) functions for three cycles T_(C), namely at time points T₀+T_(C), T₀+2T_(C) and T₀+3T_(C). During the third execution of the commit( ) function, namely at time point T₀+3T_(C), the result of the transaction is determined. Thereupon, the cycle-based slave model employs a callback mechanism to use a callback function to return to the transaction initiator model with a result indicating a COMPLETED state. Upon successful completion, the transaction is deregistered from execution for the cycle-based slave model.

In particular, the simulation employs the callback mechanism for a cycle-based slave model to reply to the master model initiating the respective transaction. As the cycle-based slave model has already returned by indicating the PENDING state, the callback mechanism provides a different, asynchronous method for transferring the control back to the master model. In particular, upon initiation of a transaction, a master model passes a function pointer to a callback function to be processed upon completion of the transaction by the slave model. The function pointer can be used by the slave model to communicate to the initiator model that the transaction is finished.
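The PENDING reply followed by a deferred callback can be sketched as follows. The Engine and CycleBasedMemory classes, the string-valued states and the single pending transaction per model are simplifying assumptions for illustration; a Python callable stands in for the function pointer.

```python
class Engine:
    """Hypothetical simulation engine scheduling registered models."""

    def __init__(self):
        self.now = 0             # elapsed cycles since T0
        self.registered = []

    def register(self, model):
        if model not in self.registered:
            self.registered.append(model)

    def deregister(self, model):
        self.registered.remove(model)

    def step(self):
        self.now += 1
        active = list(self.registered)   # copy: models may deregister
        for model in active:
            model.eval()
        for model in active:
            model.commit()


class CycleBasedMemory:
    """Hypothetical cycle-based slave with a fixed latency in cycles."""

    def __init__(self, engine, cells, latency=3):
        self.engine = engine
        self.cells = cells
        self.latency = latency
        self.pending = None

    def read(self, addr, callback):
        if not 0 <= addr < len(self.cells):
            return "ERROR"               # no device mapped at this address
        self.pending = {"addr": addr, "left": self.latency,
                        "callback": callback, "value": None}
        self.engine.register(self)
        return "PENDING"                 # immediate reply in the same cycle

    def eval(self):
        # Collect inputs; observable state is left untouched.
        self.pending["value"] = self.cells[self.pending["addr"]]

    def commit(self):
        self.pending["left"] -= 1
        if self.pending["left"] == 0:
            done, self.pending = self.pending, None
            self.engine.deregister(self)
            done["callback"]("COMPLETED", done["value"])


engine = Engine()
memory = CycleBasedMemory(engine, [10, 20, 30])
log = []
first = memory.read(1, lambda status, value: log.append((engine.now, status, value)))
for _ in range(3):
    engine.step()
```

In this sketch the read( ) call returns PENDING at once, and the callback fires during the third commit( ), after which the model is no longer registered with the engine.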

As becomes apparent from the above description regarding FIG. 5, each cycle-based model is capable of registering and deregistering to a simulation engine to schedule the execution of a transaction requesting a specific operation to be executed. The cycle T_(C) according to which the execution of the transaction is scheduled determines the execution frequency. A cycle-based model may have different execution frequencies. Accordingly, each cycle-based model has a predefined cycle T_(C) which is an integer multiple of the cycle T_(M) of a main clock. In particular, the cycle T_(M) of the main clock is defined such that T_(C)=N·T_(M) is true for the T_(C) of all cycle-based models, where N is an integer≥1.
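Reading the relation as each model cycle T_(C) being an integer multiple N of the main clock cycle T_(M), one possible way to pick T_(M) is the greatest common divisor of all model cycles. The sketch below makes that assumption explicit; the function names, the model names and the periods in picoseconds are hypothetical.

```python
from functools import reduce
from math import gcd


def main_clock_cycle(model_cycles):
    """Largest T_M such that every T_C is an integer multiple of it."""
    return reduce(gcd, model_cycles.values())


def multiples(model_cycles):
    """Integer N per model such that T_C = N * T_M."""
    t_m = main_clock_cycle(model_cycles)
    return {name: t_c // t_m for name, t_c in model_cycles.items()}


# Assumed cycle lengths in picoseconds for three cycle-based models.
cycles_ps = {"core": 2000, "cache": 4000, "memory": 6000}
t_m = main_clock_cycle(cycles_ps)
n = multiples(cycles_ps)
print(t_m, n)
```

With these values the engine would schedule the core model on every main clock cycle, the cache model on every second cycle and the memory model on every third cycle.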

Although not illustrated in FIG. 5, multiple transactions may be registered to be scheduled by the simulation engine for one cycle-based model.

Referring now to FIG. 6, an exemplary timing diagram of a simplified cache-miss operation by an instruction cache taking the role of a master and a slave model is shown. This example also illustrates the timing regarding the cache-miss operation introduced with respect to FIG. 3.

As can be seen from FIG. 6, at time point T₀, core model 605 initiates transaction T61 requesting an instruction fetch operation to instruction cache model 610. The instruction cache model 610 is realized in this example as a functional model. Accordingly, the instruction cache model 610 immediately determines if the requested instruction is present in the cache. In the example, the requested instruction is not present (or not valid) in the instruction cache model 610. Thus, instruction cache model 610 initiates transaction T62 requesting an instruction read operation to memory model 615.

The memory model 615 of this example is realized as a cycle-based model. Accordingly, the memory model 615 receives the transaction requesting the instruction read and registers it as a pending transaction to be scheduled by the simulation engine. Within the same clock cycle T_(C), the memory model 615 replies to the cache model 610 indicating a PENDING state. Due to the cache model 610 receiving the reply indicating a PENDING operation, the cache model 610 is suspended until a callback to the cache model 610 is triggered. For suspending a functional model, the parameters of the functional model are saved. Additionally, the functional model replies to its transaction initiator, in this example the core model 605, also indicating the PENDING state.

Due to the memory model 615 registering the transaction requesting the execution of an instruction read operation, the simulation engine schedules for the three consecutive cycles T₀+T_(C), T₀+2T_(C), and T₀+3T_(C) the executions Ex63, Ex64 and Ex65 of first an eval( ) and then a commit( ) function (in the example, the latency of the memory model corresponds to three cycles).

At time point T₀+3T_(C), the execution of the commit( ) function of the memory model 615 results in a completion of the instruction read operation. Accordingly, the memory model 615 copies the requested instruction to some address of the instruction cache 610. Additionally, the memory model 615 employs the callback mechanism to reply to the instruction cache indicating a COMPLETED state. Upon receipt of the result indicating the completion of the instruction read operation, the instruction cache model 610 may update the cached instructions. At the same time the instruction cache model 610 copies the requested instruction to a specified address, for instance, the register of the core model 605 supplying the next instruction, and replies thereto also via the callback mechanism indicating a COMPLETED state. Since the instruction cache model 610 is a functional model, the reply includes time information on the time which the execution of the requested cache read operation would have taken for a (hardware) device. In the example, the reply includes time information indicating N additional cycles.

Referring now to FIG. 7, a sequence of operations to be performed by a master model upon receipt of a result as a reply to a transaction is shown.

As shown in FIG. 7, the master model on the left side initiates a transaction T705 requesting the execution of an operation to a slave model on the right side. Thereupon, the slave model executes the requested operation and replies to the transaction T710 including a result of the requested operation.

Upon receipt of the transaction T710 including the result of the requested operation, the master model determines if the result indicates a PENDING state. If the result is determined to indicate a PENDING state (YES), the master model is suspended until the callback mechanism is triggered for the master model (step S715).

If the master model determines that the result does not indicate a PENDING state (NO), the master model determines if the result indicates an ERROR state. If the result is determined to indicate the ERROR state (YES), the transaction generated an error and the master model may perform error handling to recover from the erroneous state in the slave model (step S720).

If the master model determines that the result does not indicate an ERROR state (NO), the master model determines if the result indicates a COMPLETED state. A determination that the result of the transaction does not indicate the COMPLETED state is an impossible situation (S725).

If the result is determined to indicate the COMPLETED state (YES), and if the result is determined not to include a number of cycles (NO), the execution of the requested operation is indicated to have successfully completed (step S730). Thereafter, the master model continues processing operations.

If the result is determined to indicate the COMPLETED state (YES), and if the result is determined to include a number of cycles (YES), the master model reads the number of cycles included by the slave model in the reply to the transaction requesting the execution of an operation.

If the master model is a functional model, the master model adds the number of cycles received from the slave model to a number of cycles consumed by the master model for previous operations (step S735). The sum of the received number of cycles and the internal number of cycles may be included in a response to a transaction where the master model is taking the role of a slave model.

If the master model is a cycle-based model, the master model is suspended for the number of clock cycles returned by the slave model plus the number of cycles consumed by the master model itself (step S740).
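The decision sequence of FIG. 7 can be sketched as a single function. The following Python sketch is illustrative only: the names State, Reply and handle_reply, and the returned text labels, are assumptions introduced here and do not appear in the described embodiments. It merely mirrors the order of the checks (PENDING, ERROR, COMPLETED, presence of a cycle count, type of the master model).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    COMPLETED = "completed"
    PENDING = "pending"
    ERROR = "error"


@dataclass
class Reply:
    state: State
    cycles: Optional[int] = None  # execution time in main-clock cycles, if any


def handle_reply(reply: Reply, master_is_functional: bool, own_cycles: int) -> str:
    """Return an illustrative label for the FIG. 7 step taken by the master."""
    if reply.state is State.PENDING:
        return "S715: suspend until callback"
    if reply.state is State.ERROR:
        return "S720: error handling"
    if reply.state is not State.COMPLETED:
        # Mirrors S725; unreachable with the three states defined above.
        raise AssertionError("S725: impossible situation")
    if reply.cycles is None:
        return "S730: completed, continue"
    total = reply.cycles + own_cycles
    if master_is_functional:
        # A functional master accumulates the cycles for its own reply.
        return f"S735: accumulate {total} cycles"
    # A cycle-based master is suspended for the accumulated time.
    return f"S740: suspend for {total} cycles"
```

For example, a functional master that has itself consumed 2 cycles and receives a COMPLETED reply carrying 4 cycles would accumulate 6 cycles under this sketch.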

In order to further illustrate the advantages of the simulation according to the different embodiments of the present invention, an example of a core model, of an instruction cache model and of a memory model is provided in a pseudo code language. These models only implement a minimum of functionality and are incorporated to illustrate the instruction and data flow between models. In the following, first a core model is described, thereafter an instruction cache model, and last a memory model is introduced.

The following Source Code Block 1 illustrates an exemplary implementation of a cycle-based model according to the first and second embodiments of the present invention. In particular, Source Code Block 1 describes a core model in line with the core model 305 of the exemplary embodiment of FIG. 3 and the core model 605 of the exemplary embodiment of FIG. 6.

  1  void reset( )
  2  {
  3    current_stage = 0;
  4  }
  5
  6
  7  void clock_eval( )
  8  {
  9    /* not necessary in this simple example */
 10  }
 11
 12  /* one stage per clock cycle, unless in the case of stalls */
 13  void clock_commit( )
 14  {
 15    byte buffer[4];   /* 32-bit instructions */
 16
 17    ret = COMPLETED;
 18
 19    switch (current_stage)
 20    {
 21      case 0:
 22        ret = fetch(PC, buffer, fetch_callback);
 23        break;
 24
 25      case 1:
 26        inst = decode(buffer);
 27        break;
 28
 29      case 2:
 30        ret = exec(inst, exec_callback);
 31        break;
 32
 33      default:
 34        assert(0);
 35    }
 36
 37    if ((ret is COMPLETED) || (ret is ERROR))
 38    {
 39      if (ret is ERROR)
 40      {
 41        current_stage = 0; /* abort instruction */
 42
 43        treat_error( ); /* e.g. raise exception */
 44      }
 45      else
 46      {
 47        /* go to the next pipeline stage */
 48        current_stage++;
 49
 50        if (current_stage == 3)
 51        {
 52          commit_instruction(inst);
 53
 54          current_stage = 0;
 55        }
 56      }
 57    }
 58    else
 59    {
 60      /* it is useless to be clocked if we have to wait for a
 61         transaction to complete, and this will happen when one of the two
 62         callbacks is called, and the callback will reactivate the clock */
 63      suspend_clock( );
 64    }
 65  }
 66
 67
 68  mem_ret_t fetch(address, buffer, callback)
 69  {
 70    mem_ret_t ret = next_device->read(address, buffer, callback);
 71
 72    if (ret is PENDING)
 73    {
 74      save_params(address, buffer, callback);
 75    }
 76
 77    return ret;
 78  }
 79
 80
 81  inst decode(buffer)
 82  {
 83    return_instruction_encoded_in_buffer( );
 84  }
 85
 86
 87  mem_ret_t exec(inst, callback)
 88  {
 89    if ((inst is LOAD) || (inst is STORE))
 90    {
 91      mem_ret_t ret;
 92
 93      if (inst is LOAD)
 94      {
 95        ret = next_device->read(inst->address, inst->buffer, inst->callback);
 96      }
 97      else if (inst is STORE)
 98      {
 99        ret = next_device->write(inst->address, inst->buffer, inst->callback);
100      }
101
102      if (ret is PENDING)
103      {
104        save_params(inst->address, inst->buffer, inst->callback);
105      }
106
107      return ret;
108    }
109    else
110    {
111      /* in this case we suppose the instruction does not involve any
112         memory operations */
113      execute_inst(inst);
114
115      return COMPLETED(0);
116    }
117  }
118
119
120  void fetch_callback(mem_ret_t ret)
121  {
122    /* perform all other operations needed when completing the fetch stage */
123
124    /* go to the next pipeline stage */
125    current_stage = (current_stage + 1) % 3;
126
127    reactivate_clock( );
128  }
129
130
131  void exec_callback(mem_ret_t ret)
132  {
133    /* perform all other operations needed when completing the execution stage */
134
135    /* go to the next pipeline stage */
136    current_stage++;
137
138    if (current_stage == 3)
139    {
140      commit_instruction(inst);
141
142      current_stage = 0;
143    }
144
145    reactivate_clock( );
146  }

Source Code Block 1

A core in the hardware platform to be simulated can be understood as a processing unit. The core fetches an operation, decodes the fetched operation and then executes the decoded operation. The instructions are normally provided from an instruction cache or a memory holding the program.

The core model of Source Code Block 1 also realizes the same sequence of operations as a cycle-based model. In particular, the pseudo code model of Source Code Block 1 with an implementation of a core model distinguishes between three stages for the fetch, decode, and execute operation. For this purpose, the cycle-based core model comprises a state variable named current_stage. During initialization or for a system reset, the state variable current_stage is reset (cf. Source Code Block 1, lines 1-4).

During simulation, the simulation engine executes for every registered cycle-based model the eval( ) and the commit( ) function in a schedule corresponding to a predefined cycle T_(C). In the particular case, the functions are called clock_eval( ) and clock_commit( ). The clock_eval( ) function of the core model is empty (cf. Source Code Block 1, lines 7-10). The clock_commit( ) function first determines the next stage to be processed and executes the corresponding function, namely the fetch( ), the decode( ) or the exec( ) function (cf. Source Code Block 1, lines 19-35).
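The scheduling of eval( ) and commit( ) can be illustrated with a minimal Python sketch. The names RecordingModel and run_cycles are hypothetical, and the two-phase ordering used here (all registered models are evaluated before any model commits) is an assumption made for illustration; the description only requires that, for each model, eval( ) is executed before commit( ) in each cycle.

```python
class RecordingModel:
    """Hypothetical cycle-based model that records the order of its calls."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def clock_eval(self):
        self.log.append((self.name, "eval"))

    def clock_commit(self):
        self.log.append((self.name, "commit"))


def run_cycles(models, n_cycles):
    """Call clock_eval() and then clock_commit() once per cycle per model."""
    for _ in range(n_cycles):
        for model in models:   # phase 1: evaluate (assumed two-phase scheme)
            model.clock_eval()
        for model in models:   # phase 2: commit
            model.clock_commit()
```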

Since, for instance, the fetch( ) function of the core model initiates a transaction requesting the execution of an instruction fetch operation by an instruction cache, the clock_commit( ) function also includes a section (cf. Source Code Block 1, lines 37-65) for distinguishing and/or processing the result received as a reply to the initiated transaction. In particular, if the received result indicates the COMPLETED state, the core model distinguishes between the different stages the core model can be in. If the core model is in the fetch or the decode stage, a result indicating a COMPLETED state results in the core model proceeding to the next stage. If the core model is in the execute stage, a result indicating a COMPLETED state results in the core model first executing a commit_instruction( ) function before proceeding to the first stage, namely the fetch stage (cf. lines 37-57). In the case where the received result indicates an ERROR state, the core model performs error handling and aborts the processing of the last instruction.

Only if the received result indicates a PENDING state is the scheduling by the simulation engine interrupted and the core model suspended (cf. Source Code Block 1, lines 58-65). By a PENDING state, a slave model to which a transaction requesting the execution of an operation has been initiated indicates that the execution has not completed. In hardware, a core would stall, issuing NOP operations. In the simulation, however, the core model can be suspended, reducing the simulation load. To resume after a suspended state with the stage the core model was previously executing, two of the three functions of the core model, namely the fetch( ) and the exec( ) function, have an associated callback function, namely fetch_callback( ) and exec_callback( ).
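The effect of suspending a model can be sketched as follows. In this Python sketch the classes Engine and Core and the counter calls are hypothetical; the fetch stage is simply assumed to always receive a PENDING reply. The sketch shows that a suspended model is no longer clocked until its callback reactivates it.

```python
class Engine:
    """Hypothetical simulation engine that clocks only the active models."""

    def __init__(self):
        self.active = []
        self.calls = 0  # number of clock_commit() invocations so far

    def register(self, model):
        self.active.append(model)

    def suspend(self, model):
        self.active.remove(model)

    def reactivate(self, model):
        if model not in self.active:
            self.active.append(model)

    def tick(self):
        for model in list(self.active):  # copy: models may suspend themselves
            self.calls += 1
            model.clock_commit()


class Core:
    """Three-stage core; the fetch stage is assumed to always reply PENDING."""

    def __init__(self, engine):
        self.engine = engine
        self.stage = 0
        engine.register(self)

    def clock_commit(self):
        if self.stage == 0:
            self.engine.suspend(self)      # corresponds to suspend_clock( )
        else:
            self.stage = (self.stage + 1) % 3

    def fetch_callback(self):
        self.stage = (self.stage + 1) % 3  # go to the next pipeline stage
        self.engine.reactivate(self)       # corresponds to reactivate_clock( )
```

While the core is suspended, further calls to tick() do not invoke it, which is the reduction in simulation load referred to above.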

Specifically, the fetch stage, implemented by the fetch( ) function in the core model of Source Code Block 1, issues a read operation to the next device (cf. Source Code Block 1, line 71). This next device can be, for example, a model of an instruction cache. In the case where the next device replies with a result indicating the PENDING state, the parameters are saved (cf. Source Code Block 1, lines 73-77), and the core model is suspended (cf. Source Code Block 1, line 64). In the case where the next device replies with a result indicating the COMPLETED state, the core model proceeds to the next stage.

Further, the decode stage, implemented by the decode( ) function in the core model, returns the instruction encoded in a buffer (cf. Source Code Block 1, lines 82-85).

The execute stage, implemented by the exec( ) function in the core model of Source Code Block 1, distinguishes between load/store operations and other operations. In particular, if the decoded instruction is determined to be either LOAD or STORE, the corresponding transaction to a next device is initiated, namely for requesting the execution of a load or a store operation. In the example, the next device is a memory-like device. In the case where the next device replies with a result indicating the PENDING state, the parameters are saved (cf. Source Code Block 1, lines 103-106), and the core model is suspended (cf. Source Code Block 1, line 64). In the case where the next device replies with a result indicating the COMPLETED state, the core model proceeds to the next stage. Alternatively, other instructions are executed by the execute_inst( ) function (cf. Source Code Block 1, line 114).

For the callback mechanism of a cycle-based model, the core model provides two associated callback functions, namely fetch_callback( ) and exec_callback( ). The fetch_callback( ) function proceeds to the next stage of the core model and executes the reactivate_clock( ) function, which reactivates the scheduling by the simulation engine according to the predefined cycle T_(C) (cf. Source Code Block 1, lines 121-129). Similarly, the exec_callback( ) function increments the stage counter to proceed to the next stage and, if the core model is determined to proceed with the fetch stage, the exec_callback( ) function also executes a commit_instruction( ) function. Further, the core model also executes the reactivate_clock( ) function, which reactivates the scheduling by the simulation engine according to the predefined cycle T_(C) (cf. Source Code Block 1, lines 133-149).

The exemplary core model of Source Code Block 1 can be used for a simulation of the hardware platform described with respect to FIGS. 3 and 6. The interaction of the core model of Source Code Block 1 with other models is explained in the following description.

As described above, the core model 605 of FIG. 6 initiates, at time point T₀, transaction T61 requesting an instruction fetch operation to instruction cache model 610. Transaction T61 corresponds to the implementation of the core model of Source Code Block 1 executing the next_device->read( ) function (cf. Source Code Block 1, line 71). When the core model 605 of FIG. 6 receives a reply indicating a PENDING state, the implementation of the core model of Source Code Block 1 would save the parameters (cf. Source Code Block 1, line 75) and would suspend the core model by suspending the clock (cf. Source Code Block 1, line 64).

When the core model 605 of FIG. 6 receives the callback at time point T₀+3T_(C), the implementation of the core model of Source Code Block 1 would proceed to the next stage and would execute the reactivate_clock( ) function for reactivating the scheduling by the simulation engine (cf. Source Code Block 1, line 129).

The following Source Code Block 2 illustrates an exemplary implementation of a functional model according to the first and second embodiments of the present invention. In particular, Source Code Block 2 describes an instruction cache model in line with the instruction cache model 310 of the exemplary embodiment of FIG. 3 and the instruction cache model 610 of the exemplary embodiment of FIG. 6.

  1  mem_ret_t icache_read(address, size, buffer, callback)
  2  {
  3    line = identify_target_line(address);
  4
  5    if (line->valid && (line->tag == get_tag(address)))
  6    {
  7      copy_bytes(address, size, buffer, line);
  8
  9      return COMPLETED(L * clock_ratio);  /* we must imagine
 10         that the same device can be used with different clock ratios (clock
 11         ratio = main clock / device clock), this means that its latency is
 12         always L device cycles, but the returned value is related to the main clock */
 13    }
 14    else
 15    {
 16      line->tag = get_tag(address);
 17
 18      mem_ret_t ret = next_device->read(align_address(address),
 19                                        LINE_SIZE, line, icache_callback);
 20
 21      if (ret is ERROR)
 22        return ERROR(get_error_code(ret));
 23      else if (ret is COMPLETED)
 24      {
 25        copy_bytes(address, size, buffer, line);
 26
 27        return COMPLETED(get_cycles(ret) + L * clock_ratio);
 28      }
 29      else if (ret is PENDING)
 30      {
 31        save_params(address, size, buffer, callback);
 32
 33        return PENDING;
 34      }
 35    }
 36  }
 37
 38
 39  void icache_callback(mem_ret_t ret)
 40  {
 41    if (ret is ERROR)
 42      caller_callback(ERROR(get_error_code(ret)));
 43    else if (ret is COMPLETED)
 44    {
 45      copy_bytes(address, size, buffer, line);
 46
 47      caller_callback(COMPLETED(get_cycles(ret) + L * clock_ratio));
 48    }
 49    else if (ret is PENDING)
 50      assert(0);
 51  }

Source Code Block 2

The instruction cache model of Source Code Block 2 shows the behavior of the instruction cache upon a master model (e.g. a CPU) initiating an instruction cache read operation (named icache_read( )). In the case of an instruction cache read operation to a specific address, the model provides for two alternative behaviors.

Firstly, if the address is contained in a line of the cache and the line is marked as being valid (cf. Source Code Block 2, line 5), the instructions are copied from the cache line to a buffer passed by the initiator (cf. Source Code Block 2, line 7) and the model returns a COMPLETED state indicating a successful completion of the instruction cache read operation (cf. Source Code Block 2, line 9). Since the instruction cache is modeled as a functional model, the model replies to the transaction with a result including time information indicating that the read operation would have taken, on a real device, L cycles multiplied by some clock ratio, so that the returned cycle value is related to the main clock (cf. Source Code Block 2, line 9).
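The conversion from device cycles to main-clock cycles can be worked through with a short Python sketch. The function name main_clock_cycles and the example frequencies are assumptions chosen for illustration; the clock ratio is taken as main clock divided by device clock, as stated in the comment of Source Code Block 2, and is assumed here to be an integer.

```python
def main_clock_cycles(device_latency_cycles, main_clock_hz, device_clock_hz):
    """Convert a latency of L device cycles into main-clock cycles."""
    if main_clock_hz % device_clock_hz != 0:
        # Assumption of this sketch: the ratio is an integer.
        raise ValueError("main clock must be an integer multiple of device clock")
    clock_ratio = main_clock_hz // device_clock_hz
    return device_latency_cycles * clock_ratio
```

For example, with a main clock of 400 MHz and a device clock of 100 MHz the clock ratio is 4, so a latency of L = 2 device cycles is returned as 8 main-clock cycles.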

Secondly, if the address is not contained in a line of the cache or if the line is not valid, the model of the instruction cache redirects the read operation to a next device (cf. Source Code Block 2, line 19). Two different replies are possible, both of which the instruction cache model illustrated above can handle.

In the case where the next device to which the instruction read operation is redirected and all other devices which are additionally required for executing the read operation are realized as functional models, the read operation is executed (processed) immediately by the model of the next device and the other devices, and the result is immediately available together with the reply to the transaction.

In this case, the instruction cache model of Source Code Block 2 inspects the instantaneous reply to the transaction initiating the read operation, which is stored in the return variable ret (cf. Source Code Block 2, line 23). Depending on the state indicated by the return variable ret, the instruction cache model is programmed to perform error handling (cf. Source Code Block 2, lines 23-24), to copy the requested bytes upon receiving a COMPLETED state (cf. Source Code Block 2, lines 27-29) or to trigger a sleep operation upon receiving a PENDING state (cf. Source Code Block 2, lines 31-36).

The COMPLETED state can only be sent by a functional model replying instantaneously to the transaction initiating a read request. In this case, the reply to the model initiating the transaction requesting the instruction cache read includes the sum of the time information received from the next device and the L cycles multiplied by some clock ratio (e.g. the L cycles being determined by the duration of the cache miss).
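The accumulation of execution times along a chain of functional models can be sketched in Python as follows. The classes FunctionalMemory and FunctionalCache, the tuple layout (state, value, cycles) and the dictionary-based cache lines are simplifying assumptions of this sketch and do not reproduce Source Code Block 2.

```python
class FunctionalMemory:
    """Hypothetical functional memory: replies at once with a cycle estimate."""

    def __init__(self, data, latency_main_cycles):
        self.data = data
        self.latency = latency_main_cycles

    def read(self, address):
        return ("COMPLETED", self.data[address], self.latency)


class FunctionalCache:
    """Hypothetical functional cache in front of another functional model."""

    def __init__(self, next_device, latency_device_cycles, clock_ratio):
        self.next_device = next_device
        self.own_cycles = latency_device_cycles * clock_ratio  # L * clock_ratio
        self.lines = {}  # address -> cached value (simplified cache lines)

    def read(self, address):
        if address in self.lines:  # hit: only the cache's own time is reported
            return ("COMPLETED", self.lines[address], self.own_cycles)
        state, value, cycles = self.next_device.read(address)  # miss
        self.lines[address] = value
        # Report the next device's time plus the cache's own time.
        return ("COMPLETED", value, cycles + self.own_cycles)
```

With a memory estimate of 3 main-clock cycles, L = 1 and a clock ratio of 2, a miss is reported as 5 cycles and a subsequent hit on the same address as 2 cycles.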

In the case where the next device, to which the instruction read operation is redirected by the instruction cache model, is realized as a cycle-based model, the cycle-based model will reply to the transaction requesting the execution of the instruction read operation with an instantaneous reply indicating a PENDING state. Upon the cycle-based model of the next device completing the execution of the operation, the callback mechanism is used.

For the callback mechanism, the instruction cache model provides the icache_callback( ) function (cf. Source Code Block 2, lines 41-54). Upon the cycle-based model completing the initiated transaction and replying with a COMPLETED state, the callback function provides for a similar inspection of the return variable ret. Accordingly, upon receipt of the COMPLETED state, the instruction cache model tries to detect time information received from the next device and, depending on success or failure, replies to the transaction initiator of the instruction cache read operation with a sum of the time information received from the next device and the time information indicating the L cycles multiplied by some clock ratio (e.g. the L cycles being determined by the duration of the cache miss).
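The callback path of a functional model placed in front of a cycle-based model can be sketched in Python as follows. The classes StubCycleMemory and CallbackCache and the tuple layout (state, value, cycles) are assumptions of this sketch; the stub memory completes when its fire() method is called, standing in for the commit( ) call that completes the operation. Only the miss path is shown.

```python
class StubCycleMemory:
    """Hypothetical cycle-based next device: answers PENDING, completes later."""

    def __init__(self, value):
        self.value = value
        self.waiting = None

    def read(self, address, callback):
        self.waiting = callback
        return ("PENDING", None, None)

    def fire(self):
        # Stands in for the commit( ) call that completes the operation.
        callback, self.waiting = self.waiting, None
        callback(("COMPLETED", self.value, 0))


class CallbackCache:
    """Hypothetical functional cache (miss path only)."""

    def __init__(self, next_device, latency_device_cycles, clock_ratio):
        self.next_device = next_device
        self.own_cycles = latency_device_cycles * clock_ratio
        self._caller_callback = None

    def read(self, address, caller_callback):
        reply = self.next_device.read(address, self._callback)
        if reply[0] == "PENDING":
            self._caller_callback = caller_callback  # save the parameters
            return ("PENDING", None, None)
        return self._finish(reply)

    def _finish(self, reply):
        state, value, cycles = reply
        if state == "ERROR":
            return reply
        return ("COMPLETED", value, cycles + self.own_cycles)

    def _callback(self, reply):
        assert reply[0] != "PENDING"  # a callback never reports PENDING
        self._caller_callback(self._finish(reply))
```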

The exemplary instruction cache model of Source Code Block 2 can be used for a simulation of the hardware platform described with respect to FIGS. 3 and 6. The interaction of the instruction cache model of Source Code Block 2 with other models is explained in the following description. The core model 605 of FIG. 6 is described as initiating transaction T61 requesting the instruction fetch operation to instruction cache model 610. Upon receipt of transaction T61, the instruction cache model of Source Code Block 2 would first determine whether the address supplied with the instruction fetch operation is contained in a line and whether that line is marked as being valid (cf. Source Code Block 2, line 5).

In the case where this determination results in a cache miss, the instruction cache model of Source Code Block 2 would initiate a transaction requesting the execution of a read operation by a next device (cf. Source Code Block 2, line 19). This transaction corresponds to the transaction T62 of FIG. 6.

When the instruction cache model 610 of FIG. 6 receives a reply indicating a PENDING state within the same clock cycle, the instruction cache model of Source Code Block 2 would save the parameters (cf. Source Code Block 2, line 33) and would return a result indicating a PENDING state to the core model (cf. Source Code Block 2, line 35).

The memory model 615 is described as issuing the callback at time point T₀+3T_(C) in FIG. 6. Accordingly, the instruction cache model of Source Code Block 2 would proceed to inspect the transaction result, would copy the requested bytes (cf. Source Code Block 2, line 47), and would use the callback mechanism to return to the core model with a reply indicating the COMPLETED state and returning L cycles multiplied by some clock ratio.

The following Source Code Block 3 illustrates an exemplary implementation of a model with a cycle-based and a functional implementation of the same operation according to the third embodiment of the present invention. In particular, Source Code Block 3 describes a memory model in line with the memory model 315 of the exemplary embodiment of FIG. 3 and the memory model 615 of the exemplary embodiment of FIG. 6.

In accordance with the above description of embodiments of the present invention, the implementation provides for a dynamic cooperation between models. The models interact by initiating and replying to transactions. As the different types of models, namely functional and cycle-based models, implement the same interface, the two types of models can be used interchangeably in the simulation. In particular, the model type can be changed either by dynamically replacing one model type by a different model type or by dynamically reconfiguring a model including a cycle-based implementation of an operation and a functional implementation of the same operation.

For this purpose, the simulation system may define an internal state determining which of the models or which of the implementations is used for a particular transaction. Instead of an internal state, the simulation system may also read a configuration file upon startup or expect a user instruction via an input (e.g. keyboard, mouse, touch screen). Thereby, a user is enabled to determine the behavior of the simulation. Alternatively, the internal state may be changed depending on a simulation condition, e.g. a predefined simulation duration and/or a predefined simulation result. Thereby, the simulation speed or the simulation accuracy can be improved, as illustrated through the following pseudo code:

  1  mem_ret_t mem_read(address, size, buffer, callback)
  2  {
  3    if (current_mode_is_functional)
  4    {
  5      if (new_mode_must_be_cycle_based) // this can be
  6      // specified by the user for example and can be related to a particular
  7      // clock cycle, i.e. start behaving like a cycle-accurate model after cycle C
  8      {
  9        change_current_mode_to_cycle_based( );
 10
 11        // we need the clock for implementing the cycle-based model
 12        enable_clock( );
 13
 14        // start new transactions in cycle-accurate mode
 15        return mem_read_cycle_based(address, size, buffer, callback);
 16      }
 17      else
 18        // start new transactions in functional mode
 19        return mem_read_functional(address, size, buffer, callback);
 20    }
 21    else
 22    {
 23      if (new_mode_must_be_functional)
 24      {
 25        if (no_more_pending_cycle_based_transactions)
 26        {
 27          change_current_mode_to_functional( );
 28
 29          // we don't need the clock for implementing the functional model
 30          disable_clock( );
 31        }
 32
 33        // in any case start new transactions in functional mode
 34        return mem_read_functional(address, size, buffer, callback);
 35      }
 36      else
 37        // start new transactions in cycle-based mode
 38        return mem_read_cycle_based(address, size, buffer, callback);
 39    }
 40  }
 41
 42
 43  mem_ret_t mem_read_cycle_based(address, size, buffer, callback)
 44  {
 45    latency = compute_latency(address, size);
 46
 47    add_pending_trans(address, size, buffer, callback, latency);
 48
 49    return PENDING;
 50  }
 51
 52
 53  mem_ret_t mem_read_functional(address, size, buffer, callback)
 54  {
 55    latency = compute_latency(address, size);
 56
 57    copy_bytes(address, size, buffer);
 58
 59    return COMPLETED(latency * clock_ratio);
 60  }
 61
 62
 63  void mem_clock_eval( )
 64  {
 65    for (p = pending_trans; p != NULL; p = p->next)
 66      p->count--;
 67  }
 68
 69
 70  void mem_clock_commit( )
 71  {
 72    for (p = pending_trans; p != NULL; p = n)
 73    {
 74      n = p->next;
 75
 76      if (p->count == 0)
 77      {
 78        copy_bytes(address, size, buffer);
 79
 80        remove_pending_trans(p);
 81
 82        caller_callback(COMPLETED(0));
 83      }
 84    }
 85  }

Source Code Block 3

In the memory model of Source Code Block 3, the behavior of the model can be changed according to an internal state of the simulation system, namely the state variables new_mode_must_be_cycle_based and new_mode_must_be_functional. In the case where the state variable new_mode_must_be_cycle_based is true (cf. Source Code Block 3, line 5), the behavior of the memory model is switched to become a cycle-based model by enabling the clock (cf. Source Code Block 3, line 14) and triggering the cycle-based implementation of the read operation through the function mem_read_cycle_based( ) (cf. Source Code Block 3, line 17). In the case where the state variable new_mode_must_be_functional is true (cf. Source Code Block 3, line 26), the behavior of the memory model is switched to become a functional model by disabling the clock (cf. Source Code Block 3, line 14) and initiating the functional implementation of the read operation through the function mem_read_functional( ) (cf. Source Code Block 3, line 38).

In the memory model of Source Code Block 3, a state variable determines if the model behaves like a cycle-based model or like a functional model. The behavior may be set by a user for the whole duration of the simulation. Alternatively, a user may also specify the behavior of the memory model to change depending on a predefined clock cycle of the simulation clock. Defining a clock cycle of the simulation clock to switch a model from a cycle-based behavior to a functional behavior may allow for a faster completion of the simulation after the specified clock cycle (e.g. after clock cycle C). Defining a clock cycle of the simulation clock to switch a model from a functional behavior to a cycle-based behavior may allow for a more accurate simulation after the specified clock cycle (e.g. after clock cycle C, where C determines a time point when the simulated hardware platform starts performing a set of instructions which is of interest to a user).
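The mode selection performed by mem_read( ) can be condensed into a small decision function. In the following Python sketch, the function name dispatch, the string labels and the returned pair (mode after the call, implementation used for the new transaction) are assumptions made for illustration. The sketch reflects that a switch to the functional mode is deferred while cycle-based transactions are still pending, although new transactions are already served functionally.

```python
def dispatch(current_mode, requested_mode, pending_transactions):
    """Return (mode after the call, implementation used for this transaction)."""
    if current_mode == "functional":
        if requested_mode == "cycle_based":
            # Switch at once: the clock is enabled and the new transaction
            # is handled by the cycle-based implementation.
            return "cycle_based", "cycle_based"
        return "functional", "functional"
    # The current mode is cycle-based.
    if requested_mode == "functional":
        # The clock may only be disabled once nothing is pending, but the
        # new transaction is served functionally in any case.
        if pending_transactions == 0:
            return "functional", "functional"
        return "cycle_based", "functional"
    return "cycle_based", "cycle_based"
```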

Regarding the cycle-based implementation of the memory model of Source Code Block 3, the functions mem_read_cycle_based( ), mem_clock_eval( ) and mem_clock_commit( ) are important.

In particular, after the determination of the model behavior (cf. Source Code Block 3, lines 3-45), the cycle-based implementation of the memory model first determines the latency of the read operation for simulating this latency by the number of pending cycles (cf. Source Code Block 3, line 51). Second, the read operation is registered for the memory read operation to be scheduled by the simulation engine (cf. Source Code Block 3, line 53). Thereafter, the memory model replies with a result indicating a PENDING state to the model which has initiated the read operation (cf. Source Code Block 3, line 55).

Further, the memory model of Source Code Block 3 has a mem_clock_eval( ) function and a mem_clock_commit( ) function to be executed by the simulation engine upon the read operation being registered as a pending operation. Accordingly, for processing the read operation, the simulation engine executes the mem_clock_eval( ) function, which only decrements the internal counter simulating the latency of the memory. As there may be more than one transaction requesting a read operation to the simulated memory model, a list of pending transactions is used for storing each transaction requesting a read operation. This list is used to iterate through the pending transactions, decrementing the internal counter for each of the pending transactions (cf. Source Code Block 3, lines 69-70).

The mem_clock_commit( ) function of the memory model of Source Code Block 3 implements a reply to the transaction requesting the read operation. For the pending transactions, the memory model determines if the internal counter has become zero, which indicates that the latency of the memory has elapsed (cf. Source Code Block 3, lines 74-80). If the counter has become zero, the bytes to be read are copied to the specified address (cf. Source Code Block 3, line 82), the transaction is deregistered (i.e. removed) from the list of pending transactions (cf. Source Code Block 3, line 84) and the callback mechanism is executed to return to the model initiating the transaction requesting the read operation with a result indicating a COMPLETED state. The result to the model initiating the transaction also includes a zero to indicate that the operation has already completed.
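The countdown of pending transactions can be sketched in Python as follows. The class CycleBasedMemory, the dictionary used per pending transaction and the callback signature (state, value, cycles) are assumptions of this sketch; the latency is counted in calls to clock_eval( ), and the callback reports zero cycles because the time has already elapsed in the simulation.

```python
class CycleBasedMemory:
    """Hypothetical cycle-based memory with a list of pending reads."""

    def __init__(self, data, latency):
        self.data = data
        self.latency = latency
        self.pending = []  # one entry per pending read transaction

    def read(self, address, callback):
        self.pending.append(
            {"address": address, "callback": callback, "count": self.latency}
        )
        return "PENDING"

    def clock_eval(self):
        for p in self.pending:
            p["count"] -= 1  # one cycle of latency has elapsed

    def clock_commit(self):
        done = [p for p in self.pending if p["count"] == 0]
        # Deregister completed transactions before invoking callbacks, so a
        # callback may safely issue a new read.
        self.pending = [p for p in self.pending if p["count"] > 0]
        for p in done:
            p["callback"]("COMPLETED", self.data[p["address"]], 0)
```

With a latency of three cycles, the callback is invoked during the third clock_commit( ) call after the read was registered, which corresponds to the reply at T₀+3T_(C) described for FIG. 6.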

Regarding the functional implementation of the memory model of Source Code Block 3, the function mem_read_functional( ) is an important function.

After the determination of the model behavior (cf. Source Code Block 3, lines 3-45), the functional implementation of the memory model of Source Code Block 3 first determines the latency of the read operation to be simulated (cf. Source Code Block 3, line 59), second copies the bytes to be read to the specified address (cf. Source Code Block 3, line 61) and thereafter returns to the initiating model with a result including a COMPLETED state and time information indicating that the read operation would have taken LATENCY device cycles, i.e. a number of cycles corresponding to the determined latency (cf. Source Code Block 3, line 65).

The exemplary memory model of Source Code Block 3 can be used for a simulation of the hardware platform described with respect to FIGS. 3 and 6. The interaction of the memory model of Source Code Block 3 with other models is explained in the following description.

For the following example, the memory model of Source Code Block 3 is determined to be a cycle-based model. Accordingly, only the functions mem_read_cycle_based( ), mem_clock_eval( ) and mem_clock_commit( ) are used.

When the instruction cache model 610 of FIG. 6 issues, at time point T₀+T_(C), transaction T62 requesting the instruction read operation, the memory model of Source Code Block 3 would register a PENDING transaction to be scheduled by the simulation engine by the add_pending_trans( ) function (cf. Source Code Block 3, line 51).

Thereafter, the memory model of Source Code Block 3 would immediately reply to the instruction cache model indicating a PENDING state (cf. Source Code Block 3, line 53).

Because the memory model of Source Code Block 3 registers the transaction requesting the execution of an instruction read operation in the list of pending transactions, the simulation engine would, with a latency of three cycles, schedule the processing of the transaction for the three consecutive cycles. For each cycle, first the mem_clock_eval( ) function is called and then the mem_clock_commit( ) function is called (cf. Source Code Block 3, lines 68-83).

The third execution of the mem_clock_commit( ) function of the memory model of Source Code Block 3 would result in the completion of the instruction read operation. The memory model of Source Code Block 3 would copy the requested instruction to some address of the instruction cache (cf. Source Code Block 3, line 82). Additionally, the memory model of Source Code Block 3 would deregister the transaction from the list of pending transactions (cf. Source Code Block 3, line 84) and would employ the callback mechanism to reply to the instruction cache indicating a COMPLETED state with zero cycles (cf. Source Code Block 3, line 86).

One skilled in the art will understand that even though various embodiments and advantages of the present invention have been set forth in the foregoing description, the above disclosure is illustrative only, and changes may be made in detail, and yet remain within the broad principles of the disclosed invention. Therefore, the invention disclosed in the present application is to be limited only by the appended claims.

1. A computer-implemented method for simulating a multi-core hardwareplatform including a plurality of devices, each device being representedin the simulation by either a functional model or a cycle-based model,and the method being run on a simulation system and the methodcomprising the operations of: initiating a transaction by a model takingthe role of a master model to request the execution of an operation by amodel taking the role of a slave model, executing the requestedoperation by the slave model, and replying to the transaction by theslave model by returning a result of the executed operation to themaster model; wherein when the slave model is a functional model, theslave model in the simulation being adapted to execute the operationrequested by the transaction and immediately reply thereto by returningthe result of the executed operation and information on the executiontime of the operation, and wherein the execution time indicates anestimated number of cycles of a main clock which the device representedby the functional slave model would require for executing the operation.2. The computer-implemented method according to claim 1, wherein whenthe slave model is a cycle-based model, a simulation engine of thecomputer implemented method schedules the execution of the operationrequested by the transaction and the reply thereto relative to thecycles of a main clock.
3. The computer-implemented method according to claim 2, wherein each cycle-based model has a predefined cycle TC which is an integer multiple of a cycle TM of the main clock, and the simulation engine is adapted to schedule the execution of an operation requested by a transaction and a reply thereto of each of the cycle-based models relative to the respective cycle TC.
4. The computer-implemented method according to claim 1, wherein the master model is a cycle-based master model, and wherein upon receipt of the reply to the transaction including the result and the information on the execution time, the master model is suspended for a number of cycles of the main clock corresponding to the execution time indicated in the received information.
5. The computer-implemented method according to claim 1, wherein the master model is a functional model and the master model takes the role of a slave model for another master model representing a device of the simulated hardware platform, the other master model initiating another transaction for requesting the execution of an operation by the master model, and wherein upon receipt of the reply to the transaction including the result and the information on the execution time, the master model executes the operation requested by the other transaction and immediately replies thereto by returning the result of the execution of the different operation and the sum of the received number of cycles and of the estimated number of cycles associated with the execution of the operation as information on the execution time.
6. The computer-implemented method according to claim 2, wherein the simulation engine is adapted to schedule the execution of an operation requested by a transaction and a reply thereto of each of the cycle-based models at different points in time within a cycle of the main clock.
7. The computer-implemented method according to claim 1, wherein the result which is returned by a slave model as a reply to a transaction requesting the execution of an operation indicates one of the following states: COMPLETED state, where the operation is successfully completed; PENDING state, where the operation is pending; and ERROR state, where the execution of the operation results in an error.
8. The computer-implemented method according to claim 7, wherein the simulation engine is adapted to suspend a master model upon the master model receiving as a reply to a transaction requesting the execution of an operation of a slave model a result indicating a PENDING state.
9. A computer-implemented method for simulating a multi-core hardware platform comprising a plurality of devices, each device being represented in the simulation by either a functional model and/or a cycle-based model, wherein at least one device of the hardware platform is represented by both a functional model and a cycle-based model, the functional model and the cycle-based model having a common interface, and the method being run by a simulation system that executes the operations of: initiating a transaction by a model taking the role of a master model to request the execution of an operation by one of the functional model and the cycle-based model representing the same device of the hardware platform, determining according to an internal state of the simulation system which one of the two models is used as slave model for the device, executing the requested operation by the determined slave model, and replying to the transaction by the slave model returning a result of the executed operation to the master model.
10. The computer-implemented method according to claim 9, wherein when the slave model is a functional model, the slave model in the simulation is adapted to execute the operation requested by the transaction and immediately reply thereto by returning the result of the executed operation and information on the execution time, and wherein the execution time indicates an estimated number of cycles of a main clock which the device represented by the functional slave model would have required for executing the operation.
11. The computer-implemented method according to claim 10, wherein when the slave model is a cycle-based model, a simulation engine of the computer implemented method schedules the execution of the operation requested by the transaction and the reply thereto relative to the cycles of a main clock.
12. A computer-implemented method for simulating a multi-core hardware platform comprising a plurality of devices, each device being represented in the simulation by either a functional model and/or a cycle-based model, wherein at least one device of the hardware platform is represented by a model including a cycle-based implementation of an operation and a functional implementation of the same operation, the method being run by a simulation system and the method comprising the operations of: initiating a transaction by a model taking the role of a master model to request the execution of an operation by a model taking the role of a slave model, the slave model including a cycle-based implementation of the requested operation and a functional implementation of the same operation, determining according to an internal state of the simulation system which one of the two implementations is to be used by the slave model for executing the requested operation, executing the requested operation by the slave model using the determined implementation of the slave model, and replying to the transaction by the slave model returning a result of the executed operation to the master model.
13. The computer-implemented method according to claim 12, wherein when the slave model is a functional model, the slave model in the simulation is adapted to execute the operation requested by the transaction and immediately reply thereto by returning the result of the executed operation and information on the execution time, and wherein the execution time indicates an estimated number of cycles of a main clock which the device represented by the functional slave model would have required for executing the operation.
14. The computer-implemented method according to claim 13, wherein when the slave model is a cycle-based model, a simulation engine of the computer implemented method schedules the execution of the operation requested by the transaction and the reply thereto relative to the cycles of a main clock.
15. The computer-implemented method according to claim 13, wherein each cycle-based model has a predefined cycle TC which is an integer multiple of a cycle TM of the main clock, and the simulation engine is adapted to schedule the execution of an operation requested by a transaction and a reply thereto of each of the cycle-based models relative to the respective cycle TC.
16. The computer-implemented method according to claim 13, wherein the master model is a cycle-based master model, and wherein upon receipt of the reply to the transaction including the result and the information on the execution time, the master model is suspended for a number of cycles of the main clock corresponding to the execution time indicated in the received information.
17. The computer-implemented method according to claim 13, wherein the master model is a functional model and the master model takes the role of a slave model for another master model representing a device of the simulated hardware platform, the other master model initiating another transaction for requesting the execution of an operation by the master model, and wherein upon receipt of the reply to the transaction including the result and the information on the execution time, the master model executes the operation requested by the other transaction and immediately replies thereto by returning the result of the execution of the different operation and the sum of the received number of cycles and of the estimated number of cycles associated with the execution of the operation as information on the execution time.
18. The computer-implemented method according to claim 13, wherein the simulation engine is adapted to schedule the execution of an operation requested by a transaction and a reply thereto of each of the cycle-based models at different points in time within a cycle of the main clock.
19. The computer-implemented method according to claim 13, wherein the result which is returned by a slave model as a reply to a transaction requesting the execution of an operation indicates one of the following states: COMPLETED state, where the operation is successfully completed; PENDING state, where the operation is pending; and ERROR state, where the execution of the operation results in an error.
20. The computer-implemented method according to claim 19, wherein the simulation engine is adapted to suspend a master model upon the master model receiving as a reply to a transaction requesting the execution of an operation of a slave model a result indicating a PENDING state.
21. A computer-readable storage medium holding a computer program for simulating a multi-core hardware platform including a plurality of devices, each device being represented in the simulation by either a functional model or a cycle-based model, and the program operable to perform the operations of: initiating a transaction by a model taking the role of a master model to request the execution of an operation by a model taking the role of a slave model; executing the requested operation by the slave model; and replying to the transaction through the slave model returning a result of the executed operation to the master model; and wherein when the slave model is a functional model, the slave model in the simulation is adapted to execute the operation requested by the transaction and immediately reply thereto by returning the result of the executed operation and information on the execution time of the operation, the execution time indicating an estimated number of cycles of a main clock which the device represented by the functional slave model would require for executing the operation.
22. A computer system, comprising: a simulation system operable to simulate a multi-core hardware platform, the multi-core hardware platform including a plurality of devices, each device represented in the simulation system through a corresponding functional or cycle-based model, and at least some of the models in the simulation system being operable to: initiate a transaction through a first model that provides a transaction to a second model, with the first model that initiates the transaction being a master model and the second model that receives the transaction being a slave model, and the transaction requesting the slave model to execute a corresponding operation and the slave model, upon executing the operation, providing a reply to the transaction to the master model that includes a result of the executed operation, and the slave model being operable, when the slave model is a functional model, to immediately reply to the transaction from the master model by returning the result of the executed operation and information about the execution time of the executed operation, where the execution time indicates an estimated number of cycles of a main clock which the device represented by the functional slave model would require for executing the operation.
23. The computer system of claim 22, wherein the computer system includes a general purpose computer on which the simulation system executes.
24. The computer system of claim 22, wherein the multi-core hardware platform corresponds to one of a multimedia device, a television, a multi-channel HIFI system, a networking device, a mobile phone, a personal digital assistant, an MP3 player, and a general purpose computer.
25. The computer system of claim 22, wherein the multimedia device comprises one of a DVD player, Blu-Ray player, and hard-drive digital video recorder.
26. The computer system of claim 22, wherein at least some of the master models are a DMA controller or a cache memory.
27. The computer system of claim 22, wherein at least some of the slave models correspond to a bus, a main memory, a network-on-chip, or a bridge device.
28. The computer system of claim 22, wherein at least some of the slave models are cycle-based models and wherein for each cycle-based slave model the simulation system schedules the execution of the operation requested by the transaction and the reply thereto by the slave model relative to the cycles of a main clock.
29. The computer system of claim 28, wherein each cycle-based model has a predefined cycle TC which is an integer multiple of a cycle TM of the main clock.
30. The computer system of claim 29, wherein the simulation system is adapted to schedule the execution of an operation requested by a transaction and a reply thereto for each of the cycle-based models relative to the respective cycle TC.
31. The computer system of claim 22, wherein each master model is a cycle-based master model; and wherein upon receipt of the reply to the transaction including the result and the information on the execution time, the master model is suspended for a number of cycles of the main clock corresponding to the execution time indicated in the received information.
32. The computer system of claim 22, wherein each master model is a functional model and the master model takes the role of a slave model for another master model representing a device of a simulated hardware platform corresponding to the simulation system, the other master model initiating another transaction for requesting the execution of an operation by the master model, and wherein upon receipt of the reply to the transaction including the result and the information on the execution time, the master model executes the operation requested by the other transaction and immediately replies thereto by returning the result of the execution of the different operation and the sum of the received number of cycles and of the estimated number of cycles associated with the execution of the operation as information on the execution time.
33. The computer system of claim 22, wherein the simulation system is operable to schedule the execution of an operation requested by a transaction and a reply thereto for each of the cycle-based models at different points in time within a cycle of the main clock.
34. The computer system of claim 22, wherein the result returned by a slave model includes one of: a COMPLETED state, where the operation has been successfully completed; a PENDING state, where the operation is pending; and an ERROR state, where the execution of the operation results in an error.
35. The computer system of claim 34, wherein the simulation system is operable to suspend a master model upon the master model receiving a reply that includes a result indicating a PENDING state.
36. The computer system of claim 22, wherein at least one device of the multi-core hardware platform is represented through both a functional model and a cycle-based model, the functional model and the cycle-based model having a common interface.
37. The computer system of claim 26, wherein the simulation system determines, from an internal state of the simulation system, which one of the two models is to be used for each device that is represented through both a functional and a cycle-based model.
38-59. (canceled)